Data Cleaning & Manipulation

Monday, April 17

Today we will…

Example Data set – Cereal

library(liver)
data(cereal)
str(cereal, give.attr = FALSE)
'data.frame':   77 obs. of  16 variables:
 $ name    : Factor w/ 77 levels "100% Bran","100% Natural Bran",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ manuf   : Factor w/ 7 levels "A","G","K","N",..: 4 6 3 3 7 2 3 2 7 5 ...
 $ type    : Factor w/ 2 levels "cold","hot": 1 1 1 1 1 1 1 1 1 1 ...
 $ calories: int  70 120 70 50 110 110 110 130 90 90 ...
 $ protein : int  4 3 4 4 2 2 2 3 2 3 ...
 $ fat     : int  1 5 1 0 2 2 0 2 1 0 ...
 $ sodium  : int  130 15 260 140 200 180 125 210 200 210 ...
 $ fiber   : num  10 2 9 14 1 1.5 1 2 4 5 ...
 $ carbo   : num  5 8 7 8 14 10.5 11 18 15 13 ...
 $ sugars  : int  6 8 5 0 8 10 14 8 6 5 ...
 $ potass  : int  280 135 320 330 -1 70 30 100 125 190 ...
 $ vitamins: int  25 0 25 25 25 25 25 25 25 25 ...
 $ shelf   : int  3 3 3 3 3 1 2 3 1 3 ...
 $ weight  : num  1 1 1 1 1 1 1 1.33 1 1 ...
 $ cups    : num  0.33 1 0.33 0.5 0.75 0.75 1 0.75 0.67 0.67 ...
 $ rating  : num  68.4 34 59.4 93.7 34.4 ...
head(cereal)
                       name manuf type calories protein fat sodium fiber carbo
1                 100% Bran     N cold       70       4   1    130  10.0   5.0
2         100% Natural Bran     Q cold      120       3   5     15   2.0   8.0
3                  All-Bran     K cold       70       4   1    260   9.0   7.0
4 All-Bran with Extra Fiber     K cold       50       4   0    140  14.0   8.0
5            Almond Delight     R cold      110       2   2    200   1.0  14.0
6   Apple Cinnamon Cheerios     G cold      110       2   2    180   1.5  10.5
  sugars potass vitamins shelf weight cups   rating
1      6    280       25     3      1 0.33 68.40297
2      8    135        0     3      1 1.00 33.98368
3      5    320       25     3      1 0.33 59.42551
4      0    330       25     3      1 0.50 93.70491
5      8     -1       25     3      1 0.75 34.38484
6     10     70       25     1      1 0.75 29.50954
summary(cereal)
                        name    manuf    type       calories    
 100% Bran                : 1   A: 1   cold:74   Min.   : 50.0  
 100% Natural Bran        : 1   G:22   hot : 3   1st Qu.:100.0  
 All-Bran                 : 1   K:23             Median :110.0  
 All-Bran with Extra Fiber: 1   N: 6             Mean   :106.9  
 Almond Delight           : 1   P: 9             3rd Qu.:110.0  
 Apple Cinnamon Cheerios  : 1   Q: 8             Max.   :160.0  
 (Other)                  :71   R: 8                            
    protein           fat            sodium          fiber       
 Min.   :1.000   Min.   :0.000   Min.   :  0.0   Min.   : 0.000  
 1st Qu.:2.000   1st Qu.:0.000   1st Qu.:130.0   1st Qu.: 1.000  
 Median :3.000   Median :1.000   Median :180.0   Median : 2.000  
 Mean   :2.545   Mean   :1.013   Mean   :159.7   Mean   : 2.152  
 3rd Qu.:3.000   3rd Qu.:2.000   3rd Qu.:210.0   3rd Qu.: 3.000  
 Max.   :6.000   Max.   :5.000   Max.   :320.0   Max.   :14.000  
                                                                 
     carbo          sugars           potass          vitamins     
 Min.   :-1.0   Min.   :-1.000   Min.   : -1.00   Min.   :  0.00  
 1st Qu.:12.0   1st Qu.: 3.000   1st Qu.: 40.00   1st Qu.: 25.00  
 Median :14.0   Median : 7.000   Median : 90.00   Median : 25.00  
 Mean   :14.6   Mean   : 6.922   Mean   : 96.08   Mean   : 28.25  
 3rd Qu.:17.0   3rd Qu.:11.000   3rd Qu.:120.00   3rd Qu.: 25.00  
 Max.   :23.0   Max.   :15.000   Max.   :330.00   Max.   :100.00  
                                                                  
     shelf           weight          cups           rating     
 Min.   :1.000   Min.   :0.50   Min.   :0.250   Min.   :18.04  
 1st Qu.:1.000   1st Qu.:1.00   1st Qu.:0.670   1st Qu.:33.17  
 Median :2.000   Median :1.00   Median :0.750   Median :40.40  
 Mean   :2.208   Mean   :1.03   Mean   :0.821   Mean   :42.67  
 3rd Qu.:3.000   3rd Qu.:1.00   3rd Qu.:1.000   3rd Qu.:50.83  
 Max.   :3.000   Max.   :1.50   Max.   :1.500   Max.   :93.70  
                                                               

Data Wrangling with dplyr

dplyr

dplyr provides us with the “Grammar of Data Manipulation”.

  • This package gives us the tools to wrangle, manipulate, and tidy our data with ease.
  • Check out the dplyr cheatsheet.

Data wrangling by Allison Horst

dplyr verbs

  • filter() – select rows based on their values
  • arrange() – sort rows based on their values
  • select() – select columns
  • mutate() – add new columns by transforming other columns
  • summarize() – perform summary operations on columns
  • group_by() – facilitate group-wise operations

Use the pipe operator (|> or %>%) to chain together data wrangling operations.

The Pipe Operator

No matter how complex and polished the individual operations are, it is often the quality of the glue that most directly determines the power of the system.

— Hal Abelson

The Pipe Operator

  • With dplyr, your code should read like a sentence.

  • The data is the primary object in your sentence, so it should come first in your code.

  • The pipe operator is an important part of that readability.

The Pipe Operator

  • The pipe specifies a sequence of operations.
  • The output from one operation is passed into the first argument of the next operation.
  • The “original” pipe: %>%

    • Loaded with the tidyverse; it comes from the magrittr package.
  • The “native” pipe: |>

    • Introduced in R version 4.1.0.
    • In RStudio: Tools > Global Options... > Code > check the “Use native pipe operator” box.
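
To make the "first argument" rule concrete, here is a minimal sketch in base R (it needs R 4.1.0 or later for |>). The sugars vector holds made-up values for illustration only; it is not read from the cereal data.

```r
# Toy vector (hypothetical values, used only for illustration)
sugars <- c(6, 8, 5, 0, 8, 10)

# Nested call: has to be read from the inside out
nested <- round(mean(sqrt(sugars)), digits = 2)

# Piped call: reads top to bottom; each result becomes
# the first argument of the next function
piped <- sugars |>
  sqrt() |>
  mean() |>
  round(digits = 2)

# Both forms compute the same value
identical(nested, piped)
```

The two versions do the same work; the piped one simply lists the steps in the order they happen.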

The Pipe Operator

dr_rehnberg |>
  play_a_sport()


dr_rehnberg |>
  put_on("cleats") |>
  play_a_sport(type = "soccer")

Data Comes First!

  • filter(.data = cereal, ...)
  • select(.data = cereal, ...)
  • mutate(.data = cereal, ...)

These are equivalent:

summary(cereal)
cereal |> 
  summary()


The pipe operator is your friend!

You can also pipe manipulated data or summaries directly into your ggplot2 code for plotting.
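
A minimal sketch of piping wrangled data into a plot, assuming the dplyr and ggplot2 packages are installed. The toy_cereal data frame and its values are hypothetical stand-ins, not rows from the cereal data.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data (not the real cereal values)
toy_cereal <- data.frame(
  name   = c("Bran Bits", "Corn Puffs", "Hot Oats", "Sugar Stars"),
  type   = c("cold", "cold", "hot", "cold"),
  sugars = c(6, 0, 3, 12),
  rating = c(68.4, 93.7, 54.9, 33.2)
)

# The filtered data is piped into ggplot() as its data argument;
# plot layers are then added with +, not with the pipe
p <- toy_cereal |>
  filter(type == "cold") |>
  ggplot(aes(x = sugars, y = rating)) +
  geom_point()

p
```

Note the switch from |> to + once ggplot() is called: the pipe passes data between functions, while + adds layers to a plot.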

filter()

dplyr filter() by Allison Horst

filter()

We filter to the rows (observations) we would like to keep in the data.

cereal |> 
  filter(sugars < 5)
name manuf type calories protein fat sodium fiber carbo sugars potass vitamins shelf weight cups rating
All-Bran with Extra Fiber K cold 50 4 0 140 14.0 8 0 330 25 3 1.00 0.50 93.70491
Cheerios G cold 110 6 2 290 2.0 17 1 105 25 1 1.00 1.25 50.76500
Corn Chex R cold 110 2 0 280 0.0 22 3 25 25 1 1.00 1.00 41.44502
Corn Flakes K cold 100 2 0 290 1.0 21 2 35 25 1 1.00 1.00 45.86332
Cream of Wheat (Quick) N hot 100 3 0 80 1.0 21 0 -1 0 2 1.00 1.00 64.53382
Crispix K cold 110 2 0 220 1.0 21 3 30 25 3 1.00 1.00 46.89564
Grape-Nuts P cold 110 3 0 170 3.0 17 3 90 25 3 1.00 0.25 53.37101
Great Grains Pecan P cold 120 3 3 75 3.0 13 4 100 25 3 1.00 0.33 45.81172
Kix G cold 110 2 1 260 0.0 21 3 40 25 2 1.00 1.50 39.24111
Maypo A hot 100 4 1 0 0.0 16 3 95 25 2 1.00 1.00 54.85092
Nutri-grain Wheat K cold 90 3 0 170 3.0 18 2 90 25 3 1.00 1.00 59.64284
Product 19 K cold 100 3 0 320 1.0 20 3 45 100 3 1.00 1.00 41.50354
Puffed Rice Q cold 50 1 0 0 0.0 13 0 15 0 3 0.50 1.00 60.75611
Puffed Wheat Q cold 50 2 0 0 1.0 10 0 50 0 3 0.50 1.00 63.00565
Quaker Oatmeal Q hot 100 5 2 0 2.7 -1 -1 110 0 1 1.00 0.67 50.82839
Rice Chex R cold 110 1 0 240 0.0 23 2 30 25 1 1.00 1.13 41.99893
Rice Krispies K cold 110 2 0 290 0.0 22 3 35 25 1 1.00 1.00 40.56016
Shredded Wheat N cold 80 2 0 0 3.0 16 0 95 0 1 0.83 1.00 68.23588
Shredded Wheat 'n'Bran N cold 90 3 0 0 4.0 19 0 140 0 1 1.00 0.67 74.47295
Shredded Wheat spoon size N cold 90 3 0 0 3.0 20 0 120 0 1 1.00 0.67 72.80179
Special K K cold 110 6 0 230 1.0 16 3 55 25 1 1.00 1.00 53.13132
Total Corn Flakes G cold 110 2 1 200 0.0 21 3 35 100 3 1.00 1.00 38.83975
Total Whole Grain G cold 100 3 1 200 3.0 16 3 110 100 3 1.00 1.00 46.65884
Triples G cold 110 2 1 250 0.0 21 3 60 25 3 1.00 0.75 39.10617
Wheat Chex R cold 100 3 1 230 3.0 17 3 115 25 1 1.00 0.67 49.78744
Wheaties G cold 100 3 1 200 3.0 17 3 110 25 1 1.00 1.00 51.59219

filter()

We can combine multiple conditions to get a more specific subset.

cereal |> 
  filter(sugars < 5,
         type == "hot")
                  name manuf type calories protein fat sodium fiber carbo sugars potass vitamins shelf weight cups   rating
Cream of Wheat (Quick)     N  hot      100       3   0     80   1.0    21      0     -1        0     2      1 1.00 64.53382
                 Maypo     A  hot      100       4   1      0   0.0    16      3     95       25     2      1 1.00 54.85092
        Quaker Oatmeal     Q  hot      100       5   2      0   2.7    -1     -1    110        0     1      1 0.67 50.82839

filter(): Handy Helpers!

  • > – greater than
  • < – less than
  • == – equal to
  • ! – not
  • %in% – checks if an element belongs to a vector
  • is.na() – TRUE where a value is missing (NA), FALSE otherwise
  • & and , – and
  • | – or
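
The helpers above can be sketched on a small example, assuming dplyr is installed. The toy_cereal data frame is hypothetical and includes one missing sugars value so that is.na() has something to find.

```r
library(dplyr)

# Hypothetical toy data with one missing sugars value
toy_cereal <- data.frame(
  name   = c("Bran Bits", "Corn Puffs", "Hot Oats", "Sugar Stars", "Wheat Mush"),
  type   = c("cold", "cold", "hot", "cold", "hot"),
  sugars = c(6, 0, 3, 12, NA)
)

# ! negates a condition: keep everything that is not a hot cereal
not_hot <- toy_cereal |>
  filter(!(type == "hot"))

# is.na() flags missing values; !is.na() keeps rows with a recorded value
has_sugars <- toy_cereal |>
  filter(!is.na(sugars))

# & and , both mean "and"; these two calls keep the same rows
with_comma <- toy_cereal |> filter(sugars < 5, type == "hot")
with_amp   <- toy_cereal |> filter(sugars < 5 & type == "hot")

# | means "or": low-sugar cereals or hot cereals
# (the row with missing sugars is kept because type == "hot" is TRUE)
either <- toy_cereal |> filter(sugars < 5 | type == "hot")
```

filter() drops rows where the condition evaluates to NA, which is why the row with missing sugars is absent from with_comma but present in either.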

filter(): |

cereal |> 
  filter(sugars < 5,
         type == "hot")
                  name manuf type calories protein fat sodium fiber carbo sugars potass vitamins shelf weight cups   rating
Cream of Wheat (Quick)     N  hot      100       3   0     80   1.0    21      0     -1        0     2      1 1.00 64.53382
                 Maypo     A  hot      100       4   1      0   0.0    16      3     95       25     2      1 1.00 54.85092
        Quaker Oatmeal     Q  hot      100       5   2      0   2.7    -1     -1    110        0     1      1 0.67 50.82839

What if I wanted either non-sugary cereals or hot cereals…

cereal |> 
  filter(sugars < 5 |
           type == "hot")

filter(): %in%

Use %in% to keep the observations whose value matches any entry in a vector of levels.

cereal |> 
  filter(name %in% c("Cheerios", "Cinnamon Toast Crunch", "Raisin Bran", "Cracklin' Oat Bran"))
                 name manuf type calories protein fat sodium fiber carbo sugars potass vitamins shelf weight cups   rating
             Cheerios     G cold      110       6   2    290     2    17      1    105       25     1   1.00 1.25 50.76500
Cinnamon Toast Crunch     G cold      120       1   3    210     0    13      9     45       25     2   1.00 0.75 19.82357
   Cracklin' Oat Bran     K cold      110       3   3    140     4    10      7    160       25     3   1.00 0.50 40.44877
          Raisin Bran     K cold      120       3   1    210     5    14     12    240       25     2   1.33 0.75 39.25920

How do we “filter” in base R?

You can use the subset() function!

cereal |> 
  subset(name %in% c("Cheerios", "Cinnamon Toast Crunch", "Raisin Bran", "Cracklin' Oat Bran"))
                    name manuf type calories protein fat sodium fiber carbo sugars potass vitamins shelf weight cups   rating
12              Cheerios     G cold      110       6   2    290     2    17      1    105       25     1   1.00 1.25 50.76500
13 Cinnamon Toast Crunch     G cold      120       1   3    210     0    13      9     45       25     2   1.00 0.75 19.82357
20    Cracklin' Oat Bran     K cold      110       3   3    140     4    10      7    160       25     3   1.00 0.50 40.44877
59           Raisin Bran     K cold      120       3   1    210     5    14     12    240       25     2   1.33 0.75 39.25920
cereal |> 
  subset(sugars < 5 & type == "hot")
                     name manuf type calories protein fat sodium fiber carbo sugars potass vitamins shelf weight cups   rating
21 Cream of Wheat (Quick)     N  hot      100       3   0     80   1.0    21      0     -1        0     2      1 1.00 64.53382
44                  Maypo     A  hot      100       4   1      0   0.0    16      3     95       25     2      1 1.00 54.85092
58         Quaker Oatmeal     Q  hot      100       5   2      0   2.7    -1     -1    110        0     1      1 0.67 50.82839
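
Bracket indexing is another base R route to the same rows. The sketch below compares it with subset() on a hypothetical toy data frame (made-up values, no missing data).

```r
# Hypothetical toy data (no missing values)
toy_cereal <- data.frame(
  name   = c("Bran Bits", "Corn Puffs", "Hot Oats", "Sugar Stars"),
  type   = c("cold", "cold", "hot", "cold"),
  sugars = c(6, 0, 3, 12)
)

# subset() looks up column names inside the data frame
via_subset <- subset(toy_cereal, sugars < 5 & type == "hot")

# Bracket indexing needs the data frame spelled out for each column,
# plus a trailing comma to say "all columns"
via_brackets <- toy_cereal[toy_cereal$sugars < 5 & toy_cereal$type == "hot", ]

# Same rows either way; both keep the original row number
identical(via_subset, via_brackets)
```

Both base R approaches keep the original row numbers, which is why the subset() output shows row labels while the filter() output does not. One difference to be aware of: when the condition is NA, subset() drops the row but bracket indexing returns a row of NAs.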

arrange()

arrange()

We arrange the rows of the data in order of a particular variable.


cereal |> 
  arrange(sodium)
name manuf type calories protein fat sodium fiber carbo sugars potass vitamins shelf weight cups rating
Frosted Mini-Wheats K cold 100 3 0 0 3.0 14.0 7 100 25 2 1.00 0.80 58.34514
Maypo A hot 100 4 1 0 0.0 16.0 3 95 25 2 1.00 1.00 54.85092
Puffed Rice Q cold 50 1 0 0 0.0 13.0 0 15 0 3 0.50 1.00 60.75611
Puffed Wheat Q cold 50 2 0 0 1.0 10.0 0 50 0 3 0.50 1.00 63.00565
Quaker Oatmeal Q hot 100 5 2 0 2.7 -1.0 -1 110 0 1 1.00 0.67 50.82839
Raisin Squares K cold 90 2 0 0 2.0 15.0 6 110 25 3 1.00 0.50 55.33314
Shredded Wheat N cold 80 2 0 0 3.0 16.0 0 95 0 1 0.83 1.00 68.23588
Shredded Wheat 'n'Bran N cold 90 3 0 0 4.0 19.0 0 140 0 1 1.00 0.67 74.47295
Shredded Wheat spoon size N cold 90 3 0 0 3.0 20.0 0 120 0 1 1.00 0.67 72.80179
100% Natural Bran Q cold 120 3 5 15 2.0 8.0 8 135 0 3 1.00 1.00 33.98368
Strawberry Fruit Wheats N cold 90 2 0 15 3.0 15.0 5 90 25 2 1.00 1.00 59.36399
Golden Crisp P cold 100 2 0 45 0.0 11.0 15 40 25 1 1.00 0.88 35.25244
Smacks K cold 110 2 1 70 1.0 9.0 15 40 25 2 1.00 0.75 31.23005
Great Grains Pecan P cold 120 3 3 75 3.0 13.0 4 100 25 3 1.00 0.33 45.81172
Cream of Wheat (Quick) N hot 100 3 0 80 1.0 21.0 0 -1 0 2 1.00 1.00 64.53382
Corn Pops K cold 110 1 0 90 1.0 13.0 12 20 25 2 1.00 1.00 35.78279
Muesli Raisins; Dates; & Almonds R cold 150 4 3 95 3.0 16.0 11 170 25 3 1.00 1.00 37.13686
Apple Jacks K cold 110 2 0 125 1.0 11.0 14 30 25 2 1.00 1.00 33.17409
Froot Loops K cold 110 2 1 125 1.0 11.0 13 30 25 2 1.00 1.00 32.20758
100% Bran N cold 70 4 1 130 10.0 5.0 6 280 25 3 1.00 0.33 68.40297
Fruity Pebbles P cold 110 1 1 135 0.0 13.0 12 25 25 2 1.00 0.75 28.02576
Quaker Oat Squares Q cold 100 4 1 135 2.0 14.0 6 110 25 3 1.00 0.50 49.51187
All-Bran with Extra Fiber K cold 50 4 0 140 14.0 8.0 0 330 25 3 1.00 0.50 93.70491
Clusters G cold 110 3 2 140 2.0 13.0 7 105 25 3 1.00 0.50 40.40021
Cracklin' Oat Bran K cold 110 3 3 140 4.0 10.0 7 160 25 3 1.00 0.50 40.44877
Crispy Wheat & Raisins G cold 100 2 1 140 2.0 11.0 10 120 25 3 1.00 0.75 36.17620
Grape Nuts Flakes P cold 100 3 1 140 3.0 15.0 5 85 25 3 1.00 0.88 52.07690
Raisin Nut Bran G cold 100 3 2 140 2.5 10.5 8 140 25 3 1.00 0.50 39.70340
Trix G cold 110 1 1 140 0.0 13.0 12 25 25 2 1.00 1.00 27.75330
Life Q cold 100 4 2 150 2.0 12.0 6 95 25 2 1.00 0.67 45.32807
Muesli Raisins; Peaches; & Pecans R cold 150 4 3 150 3.0 16.0 11 170 25 3 1.00 1.00 34.13976
Mueslix Crispy Blend K cold 160 3 2 150 3.0 17.0 13 160 25 3 1.50 0.67 30.31335
Fruit & Fibre Dates; Walnuts; and Oats P cold 120 3 2 160 5.0 12.0 10 200 25 3 1.25 0.67 40.91705
Grape-Nuts P cold 110 3 0 170 3.0 17.0 3 90 25 3 1.00 0.25 53.37101
Just Right Crunchy Nuggets K cold 110 2 1 170 1.0 17.0 6 60 100 3 1.00 1.00 36.52368
Just Right Fruit & Nut K cold 140 3 1 170 2.0 20.0 9 95 100 3 1.30 0.75 36.47151
Nutri-grain Wheat K cold 90 3 0 170 3.0 18.0 2 90 25 3 1.00 1.00 59.64284
Oatmeal Raisin Crisp G cold 130 3 2 170 1.5 13.5 10 120 25 3 1.25 0.50 30.45084
Apple Cinnamon Cheerios G cold 110 2 2 180 1.5 10.5 10 70 25 1 1.00 0.75 29.50954
Cocoa Puffs G cold 110 1 1 180 0.0 12.0 13 55 25 2 1.00 1.00 22.73645
Count Chocula G cold 110 1 1 180 0.0 12.0 13 65 25 2 1.00 1.00 22.39651
Honey-comb P cold 110 1 0 180 0.0 14.0 11 35 25 1 1.00 1.33 28.74241
Lucky Charms G cold 110 2 1 180 0.0 12.0 12 55 25 2 1.00 1.00 26.73451
Double Chex R cold 100 2 0 190 1.0 18.0 5 80 25 3 1.00 0.75 44.33086
Nut&Honey Crunch K cold 120 2 1 190 0.0 15.0 9 40 25 2 1.00 0.67 29.92429
Total Raisin Bran G cold 140 3 1 190 4.0 15.0 14 230 100 3 1.50 1.00 28.59278
Almond Delight R cold 110 2 2 200 1.0 14.0 8 -1 25 3 1.00 0.75 34.38484
Bran Chex R cold 90 2 1 200 4.0 15.0 6 125 25 1 1.00 0.67 49.12025
Frosted Flakes K cold 110 1 0 200 1.0 14.0 11 25 25 1 1.00 0.75 31.43597
Post Nat. Raisin Bran P cold 120 3 1 200 6.0 11.0 14 260 25 3 1.33 0.67 37.84059
Total Corn Flakes G cold 110 2 1 200 0.0 21.0 3 35 100 3 1.00 1.00 38.83975
Total Whole Grain G cold 100 3 1 200 3.0 16.0 3 110 100 3 1.00 1.00 46.65884
Wheaties G cold 100 3 1 200 3.0 17.0 3 110 25 1 1.00 1.00 51.59219
Wheaties Honey Gold G cold 110 2 1 200 1.0 16.0 8 60 25 1 1.00 0.75 36.18756
Basic 4 G cold 130 3 2 210 2.0 18.0 8 100 25 3 1.33 0.75 37.03856
Bran Flakes P cold 90 3 0 210 5.0 13.0 5 190 25 3 1.00 0.67 53.31381
Cinnamon Toast Crunch G cold 120 1 3 210 0.0 13.0 9 45 25 2 1.00 0.75 19.82357
Raisin Bran K cold 120 3 1 210 5.0 14.0 12 240 25 2 1.33 0.75 39.25920
Cap'n'Crunch Q cold 120 1 2 220 0.0 12.0 12 35 25 2 1.00 0.75 18.04285
Crispix K cold 110 2 0 220 1.0 21.0 3 30 25 3 1.00 1.00 46.89564
Honey Graham Ohs Q cold 120 1 2 220 1.0 12.0 11 45 25 2 1.00 1.00 21.87129
Multi-Grain Cheerios G cold 100 2 1 220 2.0 15.0 6 90 25 1 1.00 1.00 40.10596
Nutri-Grain Almond-Raisin K cold 140 3 2 220 3.0 21.0 7 130 25 3 1.33 0.67 40.69232
Special K K cold 110 6 0 230 1.0 16.0 3 55 25 1 1.00 1.00 53.13132
Wheat Chex R cold 100 3 1 230 3.0 17.0 3 115 25 1 1.00 0.67 49.78744
Fruitful Bran K cold 120 3 0 240 5.0 14.0 12 190 25 3 1.33 0.67 41.01549
Rice Chex R cold 110 1 0 240 0.0 23.0 2 30 25 1 1.00 1.13 41.99893
Honey Nut Cheerios G cold 110 3 1 250 1.5 11.5 10 90 25 1 1.00 0.75 31.07222
Triples G cold 110 2 1 250 0.0 21.0 3 60 25 3 1.00 0.75 39.10617
All-Bran K cold 70 4 1 260 9.0 7.0 5 320 25 3 1.00 0.33 59.42551
Kix G cold 110 2 1 260 0.0 21.0 3 40 25 2 1.00 1.50 39.24111
Corn Chex R cold 110 2 0 280 0.0 22.0 3 25 25 1 1.00 1.00 41.44502
Golden Grahams G cold 110 1 1 280 0.0 15.0 9 45 25 2 1.00 0.75 23.80404
Cheerios G cold 110 6 2 290 2.0 17.0 1 105 25 1 1.00 1.25 50.76500
Corn Flakes K cold 100 2 0 290 1.0 21.0 2 35 25 1 1.00 1.00 45.86332
Rice Krispies K cold 110 2 0 290 0.0 22.0 3 35 25 1 1.00 1.00 40.56016
Product 19 K cold 100 3 0 320 1.0 20.0 3 45 100 3 1.00 1.00 41.50354

arrange()

We can arrange by multiple variables.


cereal |> 
  arrange(sodium, sugars) |>
  select(c(1:3, 7, 10))
name manuf type sodium sugars
Quaker Oatmeal Q hot 0 -1
Puffed Rice Q cold 0 0
Puffed Wheat Q cold 0 0
Shredded Wheat N cold 0 0
Shredded Wheat 'n'Bran N cold 0 0
Shredded Wheat spoon size N cold 0 0
Maypo A hot 0 3
Raisin Squares K cold 0 6
Frosted Mini-Wheats K cold 0 7
Strawberry Fruit Wheats N cold 15 5
100% Natural Bran Q cold 15 8
Golden Crisp P cold 45 15
Smacks K cold 70 15
Great Grains Pecan P cold 75 4
Cream of Wheat (Quick) N hot 80 0
Corn Pops K cold 90 12
Muesli Raisins; Dates; & Almonds R cold 95 11
Froot Loops K cold 125 13
Apple Jacks K cold 125 14
100% Bran N cold 130 6
Quaker Oat Squares Q cold 135 6
Fruity Pebbles P cold 135 12
All-Bran with Extra Fiber K cold 140 0
Grape Nuts Flakes P cold 140 5
Clusters G cold 140 7
Cracklin' Oat Bran K cold 140 7
Raisin Nut Bran G cold 140 8
Crispy Wheat & Raisins G cold 140 10
Trix G cold 140 12
Life Q cold 150 6
Muesli Raisins; Peaches; & Pecans R cold 150 11
Mueslix Crispy Blend K cold 150 13
Fruit & Fibre Dates; Walnuts; and Oats P cold 160 10
Nutri-grain Wheat K cold 170 2
Grape-Nuts P cold 170 3
Just Right Crunchy Nuggets K cold 170 6
Just Right Fruit & Nut K cold 170 9
Oatmeal Raisin Crisp G cold 170 10
Apple Cinnamon Cheerios G cold 180 10
Honey-comb P cold 180 11
Lucky Charms G cold 180 12
Cocoa Puffs G cold 180 13
Count Chocula G cold 180 13
Double Chex R cold 190 5
Nut&Honey Crunch K cold 190 9
Total Raisin Bran G cold 190 14
Total Corn Flakes G cold 200 3
Total Whole Grain G cold 200 3
Wheaties G cold 200 3
Bran Chex R cold 200 6
Almond Delight R cold 200 8
Wheaties Honey Gold G cold 200 8
Frosted Flakes K cold 200 11
Post Nat. Raisin Bran P cold 200 14
Bran Flakes P cold 210 5
Basic 4 G cold 210 8
Cinnamon Toast Crunch G cold 210 9
Raisin Bran K cold 210 12
Crispix K cold 220 3
Multi-Grain Cheerios G cold 220 6
Nutri-Grain Almond-Raisin K cold 220 7
Honey Graham Ohs Q cold 220 11
Cap'n'Crunch Q cold 220 12
Special K K cold 230 3
Wheat Chex R cold 230 3
Rice Chex R cold 240 2
Fruitful Bran K cold 240 12
Triples G cold 250 3
Honey Nut Cheerios G cold 250 10
Kix G cold 260 3
All-Bran K cold 260 5
Corn Chex R cold 280 3
Golden Grahams G cold 280 9
Cheerios G cold 290 1
Corn Flakes K cold 290 2
Rice Krispies K cold 290 3
Product 19 K cold 320 3

arrange(): Descending Order

Default is ascending order…

cereal |> 
  arrange(sodium)


…but can add desc() to get descending order!

cereal |> 
  arrange(desc(sodium))
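
desc() can also be mixed with ascending variables in one call. A minimal sketch, assuming dplyr is installed and using a hypothetical toy data frame with tied sodium values:

```r
library(dplyr)

# Hypothetical toy data with ties in sodium
toy_cereal <- data.frame(
  name   = c("Bran Bits", "Corn Puffs", "Hot Oats", "Sugar Stars"),
  sodium = c(200, 0, 0, 200),
  sugars = c(6, 0, 3, 12)
)

# Highest sodium first; ties in sodium are broken by sugars, lowest first
sorted <- toy_cereal |>
  arrange(desc(sodium), sugars)

sorted$name
```

Each variable gets its own direction, so only the columns wrapped in desc() are reversed.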

slice_max()

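Selects the n rows with the maximum values of the specified variable. Ties are kept by default (with_ties = TRUE), so more than n rows can be returned; set with_ties = FALSE to get exactly n rows.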

cereal |> 
  slice_max(order_by = sugars, n = 3)
                 name manuf type calories protein fat sodium fiber carbo sugars potass vitamins shelf weight cups   rating
         Golden Crisp     P cold      100       2   0     45     0    11     15     40       25     1   1.00 0.88 35.25244
               Smacks     K cold      110       2   1     70     1     9     15     40       25     2   1.00 0.75 31.23005
          Apple Jacks     K cold      110       2   0    125     1    11     14     30       25     2   1.00 1.00 33.17409
Post Nat. Raisin Bran     P cold      120       3   1    200     6    11     14    260       25     3   1.33 0.67 37.84059
    Total Raisin Bran     G cold      140       3   1    190     4    15     14    230      100     3   1.50 1.00 28.59278
cereal |> 
  slice_max(order_by = sugars, n = 3, with_ties = FALSE)
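
The effect of with_ties is easiest to see by counting rows. A minimal sketch, assuming dplyr is installed; the toy data frame is hypothetical and built so that three cereals tie for third place.

```r
library(dplyr)

# Hypothetical toy data: two cereals at 15 and three tied at 14
toy_cereal <- data.frame(
  name   = c("A", "B", "C", "D", "E", "F"),
  sugars = c(15, 15, 14, 14, 14, 3)
)

# Default (with_ties = TRUE): every row tied with the 3rd-largest value is kept
ties_kept <- toy_cereal |>
  slice_max(order_by = sugars, n = 3)

# with_ties = FALSE: exactly n rows are returned
ties_dropped <- toy_cereal |>
  slice_max(order_by = sugars, n = 3, with_ties = FALSE)

nrow(ties_kept)     # 5
nrow(ties_dropped)  # 3
```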

How do we “arrange” in base R?

You can use the order() function!

cereal[order(cereal$sodium),]
name manuf type calories protein fat sodium fiber carbo sugars potass vitamins shelf weight cups rating
27 Frosted Mini-Wheats K cold 100 3 0 0 3.0 14.0 7 100 25 2 1.00 0.80 58.34514
44 Maypo A hot 100 4 1 0 0.0 16.0 3 95 25 2 1.00 1.00 54.85092
55 Puffed Rice Q cold 50 1 0 0 0.0 13.0 0 15 0 3 0.50 1.00 60.75611
56 Puffed Wheat Q cold 50 2 0 0 1.0 10.0 0 50 0 3 0.50 1.00 63.00565
58 Quaker Oatmeal Q hot 100 5 2 0 2.7 -1.0 -1 110 0 1 1.00 0.67 50.82839
61 Raisin Squares K cold 90 2 0 0 2.0 15.0 6 110 25 3 1.00 0.50 55.33314
64 Shredded Wheat N cold 80 2 0 0 3.0 16.0 0 95 0 1 0.83 1.00 68.23588
65 Shredded Wheat 'n'Bran N cold 90 3 0 0 4.0 19.0 0 140 0 1 1.00 0.67 74.47295
66 Shredded Wheat spoon size N cold 90 3 0 0 3.0 20.0 0 120 0 1 1.00 0.67 72.80179
2 100% Natural Bran Q cold 120 3 5 15 2.0 8.0 8 135 0 3 1.00 1.00 33.98368
69 Strawberry Fruit Wheats N cold 90 2 0 15 3.0 15.0 5 90 25 2 1.00 1.00 59.36399
31 Golden Crisp P cold 100 2 0 45 0.0 11.0 15 40 25 1 1.00 0.88 35.25244
67 Smacks K cold 110 2 1 70 1.0 9.0 15 40 25 2 1.00 0.75 31.23005
35 Great Grains Pecan P cold 120 3 3 75 3.0 13.0 4 100 25 3 1.00 0.33 45.81172
21 Cream of Wheat (Quick) N hot 100 3 0 80 1.0 21.0 0 -1 0 2 1.00 1.00 64.53382
18 Corn Pops K cold 110 1 0 90 1.0 13.0 12 20 25 2 1.00 1.00 35.78279
45 Muesli Raisins; Dates; & Almonds R cold 150 4 3 95 3.0 16.0 11 170 25 3 1.00 1.00 37.13686
7 Apple Jacks K cold 110 2 0 125 1.0 11.0 14 30 25 2 1.00 1.00 33.17409
25 Froot Loops K cold 110 2 1 125 1.0 11.0 13 30 25 2 1.00 1.00 32.20758
1 100% Bran N cold 70 4 1 130 10.0 5.0 6 280 25 3 1.00 0.33 68.40297
30 Fruity Pebbles P cold 110 1 1 135 0.0 13.0 12 25 25 2 1.00 0.75 28.02576
57 Quaker Oat Squares Q cold 100 4 1 135 2.0 14.0 6 110 25 3 1.00 0.50 49.51187
4 All-Bran with Extra Fiber K cold 50 4 0 140 14.0 8.0 0 330 25 3 1.00 0.50 93.70491
14 Clusters G cold 110 3 2 140 2.0 13.0 7 105 25 3 1.00 0.50 40.40021
20 Cracklin' Oat Bran K cold 110 3 3 140 4.0 10.0 7 160 25 3 1.00 0.50 40.44877
23 Crispy Wheat & Raisins G cold 100 2 1 140 2.0 11.0 10 120 25 3 1.00 0.75 36.17620
33 Grape Nuts Flakes P cold 100 3 1 140 3.0 15.0 5 85 25 3 1.00 0.88 52.07690
60 Raisin Nut Bran G cold 100 3 2 140 2.5 10.5 8 140 25 3 1.00 0.50 39.70340
74 Trix G cold 110 1 1 140 0.0 13.0 12 25 25 2 1.00 1.00 27.75330
42 Life Q cold 100 4 2 150 2.0 12.0 6 95 25 2 1.00 0.67 45.32807
46 Muesli Raisins; Peaches; & Pecans R cold 150 4 3 150 3.0 16.0 11 170 25 3 1.00 1.00 34.13976
47 Mueslix Crispy Blend K cold 160 3 2 150 3.0 17.0 13 160 25 3 1.50 0.67 30.31335
28 Fruit & Fibre Dates; Walnuts; and Oats P cold 120 3 2 160 5.0 12.0 10 200 25 3 1.25 0.67 40.91705
34 Grape-Nuts P cold 110 3 0 170 3.0 17.0 3 90 25 3 1.00 0.25 53.37101
39 Just Right Crunchy Nuggets K cold 110 2 1 170 1.0 17.0 6 60 100 3 1.00 1.00 36.52368
40 Just Right Fruit & Nut K cold 140 3 1 170 2.0 20.0 9 95 100 3 1.30 0.75 36.47151
51 Nutri-grain Wheat K cold 90 3 0 170 3.0 18.0 2 90 25 3 1.00 1.00 59.64284
52 Oatmeal Raisin Crisp G cold 130 3 2 170 1.5 13.5 10 120 25 3 1.25 0.50 30.45084
6 Apple Cinnamon Cheerios G cold 110 2 2 180 1.5 10.5 10 70 25 1 1.00 0.75 29.50954
15 Cocoa Puffs G cold 110 1 1 180 0.0 12.0 13 55 25 2 1.00 1.00 22.73645
19 Count Chocula G cold 110 1 1 180 0.0 12.0 13 65 25 2 1.00 1.00 22.39651
38 Honey-comb P cold 110 1 0 180 0.0 14.0 11 35 25 1 1.00 1.33 28.74241
43 Lucky Charms G cold 110 2 1 180 0.0 12.0 12 55 25 2 1.00 1.00 26.73451
24 Double Chex R cold 100 2 0 190 1.0 18.0 5 80 25 3 1.00 0.75 44.33086
49 Nut&Honey Crunch K cold 120 2 1 190 0.0 15.0 9 40 25 2 1.00 0.67 29.92429
71 Total Raisin Bran G cold 140 3 1 190 4.0 15.0 14 230 100 3 1.50 1.00 28.59278
5 Almond Delight R cold 110 2 2 200 1.0 14.0 8 -1 25 3 1.00 0.75 34.38484
9 Bran Chex R cold 90 2 1 200 4.0 15.0 6 125 25 1 1.00 0.67 49.12025
26 Frosted Flakes K cold 110 1 0 200 1.0 14.0 11 25 25 1 1.00 0.75 31.43597
53 Post Nat. Raisin Bran P cold 120 3 1 200 6.0 11.0 14 260 25 3 1.33 0.67 37.84059
70 Total Corn Flakes G cold 110 2 1 200 0.0 21.0 3 35 100 3 1.00 1.00 38.83975
72 Total Whole Grain G cold 100 3 1 200 3.0 16.0 3 110 100 3 1.00 1.00 46.65884
76 Wheaties G cold 100 3 1 200 3.0 17.0 3 110 25 1 1.00 1.00 51.59219
77 Wheaties Honey Gold G cold 110 2 1 200 1.0 16.0 8 60 25 1 1.00 0.75 36.18756
8 Basic 4 G cold 130 3 2 210 2.0 18.0 8 100 25 3 1.33 0.75 37.03856
10 Bran Flakes P cold 90 3 0 210 5.0 13.0 5 190 25 3 1.00 0.67 53.31381
13 Cinnamon Toast Crunch G cold 120 1 3 210 0.0 13.0 9 45 25 2 1.00 0.75 19.82357
59 Raisin Bran K cold 120 3 1 210 5.0 14.0 12 240 25 2 1.33 0.75 39.25920
11 Cap'n'Crunch Q cold 120 1 2 220 0.0 12.0 12 35 25 2 1.00 0.75 18.04285
22 Crispix K cold 110 2 0 220 1.0 21.0 3 30 25 3 1.00 1.00 46.89564
36 Honey Graham Ohs Q cold 120 1 2 220 1.0 12.0 11 45 25 2 1.00 1.00 21.87129
48 Multi-Grain Cheerios G cold 100 2 1 220 2.0 15.0 6 90 25 1 1.00 1.00 40.10596
50 Nutri-Grain Almond-Raisin K cold 140 3 2 220 3.0 21.0 7 130 25 3 1.33 0.67 40.69232
68 Special K K cold 110 6 0 230 1.0 16.0 3 55 25 1 1.00 1.00 53.13132
75 Wheat Chex R cold 100 3 1 230 3.0 17.0 3 115 25 1 1.00 0.67 49.78744
29 Fruitful Bran K cold 120 3 0 240 5.0 14.0 12 190 25 3 1.33 0.67 41.01549
62 Rice Chex R cold 110 1 0 240 0.0 23.0 2 30 25 1 1.00 1.13 41.99893
37 Honey Nut Cheerios G cold 110 3 1 250 1.5 11.5 10 90 25 1 1.00 0.75 31.07222
73 Triples G cold 110 2 1 250 0.0 21.0 3 60 25 3 1.00 0.75 39.10617
3 All-Bran K cold 70 4 1 260 9.0 7.0 5 320 25 3 1.00 0.33 59.42551
41 Kix G cold 110 2 1 260 0.0 21.0 3 40 25 2 1.00 1.50 39.24111
16 Corn Chex R cold 110 2 0 280 0.0 22.0 3 25 25 1 1.00 1.00 41.44502
32 Golden Grahams G cold 110 1 1 280 0.0 15.0 9 45 25 2 1.00 0.75 23.80404
12 Cheerios G cold 110 6 2 290 2.0 17.0 1 105 25 1 1.00 1.25 50.76500
17 Corn Flakes K cold 100 2 0 290 1.0 21.0 2 35 25 1 1.00 1.00 45.86332
63 Rice Krispies K cold 110 2 0 290 0.0 22.0 3 35 25 1 1.00 1.00 40.56016
54 Product 19 K cold 100 3 0 320 1.0 20.0 3 45 100 3 1.00 1.00 41.50354
cereal[order(cereal$sodium, cereal$sugars),]
name manuf type calories protein fat sodium fiber carbo sugars potass vitamins shelf weight cups rating
58 Quaker Oatmeal Q hot 100 5 2 0 2.7 -1.0 -1 110 0 1 1.00 0.67 50.82839
55 Puffed Rice Q cold 50 1 0 0 0.0 13.0 0 15 0 3 0.50 1.00 60.75611
56 Puffed Wheat Q cold 50 2 0 0 1.0 10.0 0 50 0 3 0.50 1.00 63.00565
64 Shredded Wheat N cold 80 2 0 0 3.0 16.0 0 95 0 1 0.83 1.00 68.23588
65 Shredded Wheat 'n'Bran N cold 90 3 0 0 4.0 19.0 0 140 0 1 1.00 0.67 74.47295
66 Shredded Wheat spoon size N cold 90 3 0 0 3.0 20.0 0 120 0 1 1.00 0.67 72.80179
44 Maypo A hot 100 4 1 0 0.0 16.0 3 95 25 2 1.00 1.00 54.85092
61 Raisin Squares K cold 90 2 0 0 2.0 15.0 6 110 25 3 1.00 0.50 55.33314
27 Frosted Mini-Wheats K cold 100 3 0 0 3.0 14.0 7 100 25 2 1.00 0.80 58.34514
69 Strawberry Fruit Wheats N cold 90 2 0 15 3.0 15.0 5 90 25 2 1.00 1.00 59.36399
2 100% Natural Bran Q cold 120 3 5 15 2.0 8.0 8 135 0 3 1.00 1.00 33.98368
31 Golden Crisp P cold 100 2 0 45 0.0 11.0 15 40 25 1 1.00 0.88 35.25244
67 Smacks K cold 110 2 1 70 1.0 9.0 15 40 25 2 1.00 0.75 31.23005
35 Great Grains Pecan P cold 120 3 3 75 3.0 13.0 4 100 25 3 1.00 0.33 45.81172
21 Cream of Wheat (Quick) N hot 100 3 0 80 1.0 21.0 0 -1 0 2 1.00 1.00 64.53382
18 Corn Pops K cold 110 1 0 90 1.0 13.0 12 20 25 2 1.00 1.00 35.78279
45 Muesli Raisins; Dates; & Almonds R cold 150 4 3 95 3.0 16.0 11 170 25 3 1.00 1.00 37.13686
25 Froot Loops K cold 110 2 1 125 1.0 11.0 13 30 25 2 1.00 1.00 32.20758
7 Apple Jacks K cold 110 2 0 125 1.0 11.0 14 30 25 2 1.00 1.00 33.17409
1 100% Bran N cold 70 4 1 130 10.0 5.0 6 280 25 3 1.00 0.33 68.40297
57 Quaker Oat Squares Q cold 100 4 1 135 2.0 14.0 6 110 25 3 1.00 0.50 49.51187
30 Fruity Pebbles P cold 110 1 1 135 0.0 13.0 12 25 25 2 1.00 0.75 28.02576
4 All-Bran with Extra Fiber K cold 50 4 0 140 14.0 8.0 0 330 25 3 1.00 0.50 93.70491
33 Grape Nuts Flakes P cold 100 3 1 140 3.0 15.0 5 85 25 3 1.00 0.88 52.07690
14 Clusters G cold 110 3 2 140 2.0 13.0 7 105 25 3 1.00 0.50 40.40021
20 Cracklin' Oat Bran K cold 110 3 3 140 4.0 10.0 7 160 25 3 1.00 0.50 40.44877
60 Raisin Nut Bran G cold 100 3 2 140 2.5 10.5 8 140 25 3 1.00 0.50 39.70340
23 Crispy Wheat & Raisins G cold 100 2 1 140 2.0 11.0 10 120 25 3 1.00 0.75 36.17620
74 Trix G cold 110 1 1 140 0.0 13.0 12 25 25 2 1.00 1.00 27.75330
42 Life Q cold 100 4 2 150 2.0 12.0 6 95 25 2 1.00 0.67 45.32807
46 Muesli Raisins; Peaches; & Pecans R cold 150 4 3 150 3.0 16.0 11 170 25 3 1.00 1.00 34.13976
47 Mueslix Crispy Blend K cold 160 3 2 150 3.0 17.0 13 160 25 3 1.50 0.67 30.31335
28 Fruit & Fibre Dates; Walnuts; and Oats P cold 120 3 2 160 5.0 12.0 10 200 25 3 1.25 0.67 40.91705
51 Nutri-grain Wheat K cold 90 3 0 170 3.0 18.0 2 90 25 3 1.00 1.00 59.64284
34 Grape-Nuts P cold 110 3 0 170 3.0 17.0 3 90 25 3 1.00 0.25 53.37101
39 Just Right Crunchy Nuggets K cold 110 2 1 170 1.0 17.0 6 60 100 3 1.00 1.00 36.52368
40 Just Right Fruit & Nut K cold 140 3 1 170 2.0 20.0 9 95 100 3 1.30 0.75 36.47151
52 Oatmeal Raisin Crisp G cold 130 3 2 170 1.5 13.5 10 120 25 3 1.25 0.50 30.45084
6 Apple Cinnamon Cheerios G cold 110 2 2 180 1.5 10.5 10 70 25 1 1.00 0.75 29.50954
38 Honey-comb P cold 110 1 0 180 0.0 14.0 11 35 25 1 1.00 1.33 28.74241
43 Lucky Charms G cold 110 2 1 180 0.0 12.0 12 55 25 2 1.00 1.00 26.73451
15 Cocoa Puffs G cold 110 1 1 180 0.0 12.0 13 55 25 2 1.00 1.00 22.73645
19 Count Chocula G cold 110 1 1 180 0.0 12.0 13 65 25 2 1.00 1.00 22.39651
24 Double Chex R cold 100 2 0 190 1.0 18.0 5 80 25 3 1.00 0.75 44.33086
49 Nut&Honey Crunch K cold 120 2 1 190 0.0 15.0 9 40 25 2 1.00 0.67 29.92429
71 Total Raisin Bran G cold 140 3 1 190 4.0 15.0 14 230 100 3 1.50 1.00 28.59278
70 Total Corn Flakes G cold 110 2 1 200 0.0 21.0 3 35 100 3 1.00 1.00 38.83975
72 Total Whole Grain G cold 100 3 1 200 3.0 16.0 3 110 100 3 1.00 1.00 46.65884
76 Wheaties G cold 100 3 1 200 3.0 17.0 3 110 25 1 1.00 1.00 51.59219
9 Bran Chex R cold 90 2 1 200 4.0 15.0 6 125 25 1 1.00 0.67 49.12025
5 Almond Delight R cold 110 2 2 200 1.0 14.0 8 -1 25 3 1.00 0.75 34.38484
77 Wheaties Honey Gold G cold 110 2 1 200 1.0 16.0 8 60 25 1 1.00 0.75 36.18756
26 Frosted Flakes K cold 110 1 0 200 1.0 14.0 11 25 25 1 1.00 0.75 31.43597
53 Post Nat. Raisin Bran P cold 120 3 1 200 6.0 11.0 14 260 25 3 1.33 0.67 37.84059
10 Bran Flakes P cold 90 3 0 210 5.0 13.0 5 190 25 3 1.00 0.67 53.31381
8 Basic 4 G cold 130 3 2 210 2.0 18.0 8 100 25 3 1.33 0.75 37.03856
13 Cinnamon Toast Crunch G cold 120 1 3 210 0.0 13.0 9 45 25 2 1.00 0.75 19.82357
59 Raisin Bran K cold 120 3 1 210 5.0 14.0 12 240 25 2 1.33 0.75 39.25920
22 Crispix K cold 110 2 0 220 1.0 21.0 3 30 25 3 1.00 1.00 46.89564
48 Multi-Grain Cheerios G cold 100 2 1 220 2.0 15.0 6 90 25 1 1.00 1.00 40.10596
50 Nutri-Grain Almond-Raisin K cold 140 3 2 220 3.0 21.0 7 130 25 3 1.33 0.67 40.69232
36 Honey Graham Ohs Q cold 120 1 2 220 1.0 12.0 11 45 25 2 1.00 1.00 21.87129
11 Cap'n'Crunch Q cold 120 1 2 220 0.0 12.0 12 35 25 2 1.00 0.75 18.04285
68 Special K K cold 110 6 0 230 1.0 16.0 3 55 25 1 1.00 1.00 53.13132
75 Wheat Chex R cold 100 3 1 230 3.0 17.0 3 115 25 1 1.00 0.67 49.78744
62 Rice Chex R cold 110 1 0 240 0.0 23.0 2 30 25 1 1.00 1.13 41.99893
29 Fruitful Bran K cold 120 3 0 240 5.0 14.0 12 190 25 3 1.33 0.67 41.01549
73 Triples G cold 110 2 1 250 0.0 21.0 3 60 25 3 1.00 0.75 39.10617
37 Honey Nut Cheerios G cold 110 3 1 250 1.5 11.5 10 90 25 1 1.00 0.75 31.07222
41 Kix G cold 110 2 1 260 0.0 21.0 3 40 25 2 1.00 1.50 39.24111
3 All-Bran K cold 70 4 1 260 9.0 7.0 5 320 25 3 1.00 0.33 59.42551
16 Corn Chex R cold 110 2 0 280 0.0 22.0 3 25 25 1 1.00 1.00 41.44502
32 Golden Grahams G cold 110 1 1 280 0.0 15.0 9 45 25 2 1.00 0.75 23.80404
12 Cheerios G cold 110 6 2 290 2.0 17.0 1 105 25 1 1.00 1.25 50.76500
17 Corn Flakes K cold 100 2 0 290 1.0 21.0 2 35 25 1 1.00 1.00 45.86332
63 Rice Krispies K cold 110 2 0 290 0.0 22.0 3 35 25 1 1.00 1.00 40.56016
54 Product 19 K cold 100 3 0 320 1.0 20.0 3 45 100 3 1.00 1.00 41.50354
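
Descending order works in base R too. The sketch below shows two common options on a hypothetical toy data frame: the decreasing argument of order(), and negating a numeric column.

```r
# Hypothetical toy data with ties in sodium
toy_cereal <- data.frame(
  name   = c("Bran Bits", "Corn Puffs", "Hot Oats", "Sugar Stars"),
  sodium = c(200, 0, 0, 200),
  sugars = c(6, 0, 3, 12)
)

# decreasing = TRUE reverses the sort direction
desc_sodium <- toy_cereal[order(toy_cereal$sodium, decreasing = TRUE), ]

# A minus sign flips one numeric key only:
# sodium descending, then sugars ascending within ties
mixed <- toy_cereal[order(-toy_cereal$sodium, toy_cereal$sugars), ]

mixed$name
```

The minus-sign trick only applies to numeric columns; it plays the role that desc() plays in arrange().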

select()

select()

We select which variables we would like to remain in the data.

cereal |> 
  select(name, manuf, calories, cups)
name manuf calories cups
100% Bran N 70 0.33
100% Natural Bran Q 120 1.00
All-Bran K 70 0.33
All-Bran with Extra Fiber K 50 0.50
Almond Delight R 110 0.75
Apple Cinnamon Cheerios G 110 0.75
Apple Jacks K 110 1.00
Basic 4 G 130 0.75
Bran Chex R 90 0.67
Bran Flakes P 90 0.67
Cap'n'Crunch Q 120 0.75
Cheerios G 110 1.25
Cinnamon Toast Crunch G 120 0.75
Clusters G 110 0.50
Cocoa Puffs G 110 1.00
Corn Chex R 110 1.00
Corn Flakes K 100 1.00
Corn Pops K 110 1.00
Count Chocula G 110 1.00
Cracklin' Oat Bran K 110 0.50
Cream of Wheat (Quick) N 100 1.00
Crispix K 110 1.00
Crispy Wheat & Raisins G 100 0.75
Double Chex R 100 0.75
Froot Loops K 110 1.00
Frosted Flakes K 110 0.75
Frosted Mini-Wheats K 100 0.80
Fruit & Fibre Dates; Walnuts; and Oats P 120 0.67
Fruitful Bran K 120 0.67
Fruity Pebbles P 110 0.75
Golden Crisp P 100 0.88
Golden Grahams G 110 0.75
Grape Nuts Flakes P 100 0.88
Grape-Nuts P 110 0.25
Great Grains Pecan P 120 0.33
Honey Graham Ohs Q 120 1.00
Honey Nut Cheerios G 110 0.75
Honey-comb P 110 1.33
Just Right Crunchy Nuggets K 110 1.00
Just Right Fruit & Nut K 140 0.75
Kix G 110 1.50
Life Q 100 0.67
Lucky Charms G 110 1.00
Maypo A 100 1.00
Muesli Raisins; Dates; & Almonds R 150 1.00
Muesli Raisins; Peaches; & Pecans R 150 1.00
Mueslix Crispy Blend K 160 0.67
Multi-Grain Cheerios G 100 1.00
Nut&Honey Crunch K 120 0.67
Nutri-Grain Almond-Raisin K 140 0.67
Nutri-grain Wheat K 90 1.00
Oatmeal Raisin Crisp G 130 0.50
Post Nat. Raisin Bran P 120 0.67
Product 19 K 100 1.00
Puffed Rice Q 50 1.00
Puffed Wheat Q 50 1.00
Quaker Oat Squares Q 100 0.50
Quaker Oatmeal Q 100 0.67
Raisin Bran K 120 0.75
Raisin Nut Bran G 100 0.50
Raisin Squares K 90 0.50
Rice Chex R 110 1.13
Rice Krispies K 110 1.00
Shredded Wheat N 80 1.00
Shredded Wheat 'n'Bran N 90 0.67
Shredded Wheat spoon size N 90 0.67
Smacks K 110 0.75
Special K K 110 1.00
Strawberry Fruit Wheats N 90 1.00
Total Corn Flakes G 110 1.00
Total Raisin Bran G 140 1.00
Total Whole Grain G 100 1.00
Triples G 110 0.75
Trix G 110 1.00
Wheat Chex R 100 0.67
Wheaties G 100 1.00
Wheaties Honey Gold G 110 0.75

select()

You can use : to select a sequence of columns.

cereal |> 
  select(name:calories)
name manuf type calories
100% Bran N cold 70
100% Natural Bran Q cold 120
All-Bran K cold 70
All-Bran with Extra Fiber K cold 50
Almond Delight R cold 110
Apple Cinnamon Cheerios G cold 110
Apple Jacks K cold 110
Basic 4 G cold 130
Bran Chex R cold 90
Bran Flakes P cold 90
Cap'n'Crunch Q cold 120
Cheerios G cold 110
Cinnamon Toast Crunch G cold 120
Clusters G cold 110
Cocoa Puffs G cold 110
Corn Chex R cold 110
Corn Flakes K cold 100
Corn Pops K cold 110
Count Chocula G cold 110
Cracklin' Oat Bran K cold 110
Cream of Wheat (Quick) N hot 100
Crispix K cold 110
Crispy Wheat & Raisins G cold 100
Double Chex R cold 100
Froot Loops K cold 110
Frosted Flakes K cold 110
Frosted Mini-Wheats K cold 100
Fruit & Fibre Dates; Walnuts; and Oats P cold 120
Fruitful Bran K cold 120
Fruity Pebbles P cold 110
Golden Crisp P cold 100
Golden Grahams G cold 110
Grape Nuts Flakes P cold 100
Grape-Nuts P cold 110
Great Grains Pecan P cold 120
Honey Graham Ohs Q cold 120
Honey Nut Cheerios G cold 110
Honey-comb P cold 110
Just Right Crunchy Nuggets K cold 110
Just Right Fruit & Nut K cold 140
Kix G cold 110
Life Q cold 100
Lucky Charms G cold 110
Maypo A hot 100
Muesli Raisins; Dates; & Almonds R cold 150
Muesli Raisins; Peaches; & Pecans R cold 150
Mueslix Crispy Blend K cold 160
Multi-Grain Cheerios G cold 100
Nut&Honey Crunch K cold 120
Nutri-Grain Almond-Raisin K cold 140
Nutri-grain Wheat K cold 90
Oatmeal Raisin Crisp G cold 130
Post Nat. Raisin Bran P cold 120
Product 19 K cold 100
Puffed Rice Q cold 50
Puffed Wheat Q cold 50
Quaker Oat Squares Q cold 100
Quaker Oatmeal Q hot 100
Raisin Bran K cold 120
Raisin Nut Bran G cold 100
Raisin Squares K cold 90
Rice Chex R cold 110
Rice Krispies K cold 110
Shredded Wheat N cold 80
Shredded Wheat 'n'Bran N cold 90
Shredded Wheat spoon size N cold 90
Smacks K cold 110
Special K K cold 110
Strawberry Fruit Wheats N cold 90
Total Corn Flakes G cold 110
Total Raisin Bran G cold 140
Total Whole Grain G cold 100
Triples G cold 110
Trix G cold 110
Wheat Chex R cold 100
Wheaties G cold 100
Wheaties Honey Gold G cold 110

You can remove columns from the data set by putting a - in front of their names.

cereal |> 
  select(-rating)

select(): Reordering

You can reorder columns inside of select().

cereal |> 
  select(name, rating, manuf, type, calories, cups, weight,
         everything())
name rating manuf type calories cups weight protein fat sodium fiber carbo sugars potass vitamins shelf
100% Bran 68.40297 N cold 70 0.33 1.00 4 1 130 10.0 5.0 6 280 25 3
100% Natural Bran 33.98368 Q cold 120 1.00 1.00 3 5 15 2.0 8.0 8 135 0 3
All-Bran 59.42551 K cold 70 0.33 1.00 4 1 260 9.0 7.0 5 320 25 3
All-Bran with Extra Fiber 93.70491 K cold 50 0.50 1.00 4 0 140 14.0 8.0 0 330 25 3
Almond Delight 34.38484 R cold 110 0.75 1.00 2 2 200 1.0 14.0 8 -1 25 3
Apple Cinnamon Cheerios 29.50954 G cold 110 0.75 1.00 2 2 180 1.5 10.5 10 70 25 1
Apple Jacks 33.17409 K cold 110 1.00 1.00 2 0 125 1.0 11.0 14 30 25 2
Basic 4 37.03856 G cold 130 0.75 1.33 3 2 210 2.0 18.0 8 100 25 3
Bran Chex 49.12025 R cold 90 0.67 1.00 2 1 200 4.0 15.0 6 125 25 1
Bran Flakes 53.31381 P cold 90 0.67 1.00 3 0 210 5.0 13.0 5 190 25 3
Cap'n'Crunch 18.04285 Q cold 120 0.75 1.00 1 2 220 0.0 12.0 12 35 25 2
Cheerios 50.76500 G cold 110 1.25 1.00 6 2 290 2.0 17.0 1 105 25 1
Cinnamon Toast Crunch 19.82357 G cold 120 0.75 1.00 1 3 210 0.0 13.0 9 45 25 2
Clusters 40.40021 G cold 110 0.50 1.00 3 2 140 2.0 13.0 7 105 25 3
Cocoa Puffs 22.73645 G cold 110 1.00 1.00 1 1 180 0.0 12.0 13 55 25 2
Corn Chex 41.44502 R cold 110 1.00 1.00 2 0 280 0.0 22.0 3 25 25 1
Corn Flakes 45.86332 K cold 100 1.00 1.00 2 0 290 1.0 21.0 2 35 25 1
Corn Pops 35.78279 K cold 110 1.00 1.00 1 0 90 1.0 13.0 12 20 25 2
Count Chocula 22.39651 G cold 110 1.00 1.00 1 1 180 0.0 12.0 13 65 25 2
Cracklin' Oat Bran 40.44877 K cold 110 0.50 1.00 3 3 140 4.0 10.0 7 160 25 3
Cream of Wheat (Quick) 64.53382 N hot 100 1.00 1.00 3 0 80 1.0 21.0 0 -1 0 2
Crispix 46.89564 K cold 110 1.00 1.00 2 0 220 1.0 21.0 3 30 25 3
Crispy Wheat & Raisins 36.17620 G cold 100 0.75 1.00 2 1 140 2.0 11.0 10 120 25 3
Double Chex 44.33086 R cold 100 0.75 1.00 2 0 190 1.0 18.0 5 80 25 3
Froot Loops 32.20758 K cold 110 1.00 1.00 2 1 125 1.0 11.0 13 30 25 2
Frosted Flakes 31.43597 K cold 110 0.75 1.00 1 0 200 1.0 14.0 11 25 25 1
Frosted Mini-Wheats 58.34514 K cold 100 0.80 1.00 3 0 0 3.0 14.0 7 100 25 2
Fruit & Fibre Dates; Walnuts; and Oats 40.91705 P cold 120 0.67 1.25 3 2 160 5.0 12.0 10 200 25 3
Fruitful Bran 41.01549 K cold 120 0.67 1.33 3 0 240 5.0 14.0 12 190 25 3
Fruity Pebbles 28.02576 P cold 110 0.75 1.00 1 1 135 0.0 13.0 12 25 25 2
Golden Crisp 35.25244 P cold 100 0.88 1.00 2 0 45 0.0 11.0 15 40 25 1
Golden Grahams 23.80404 G cold 110 0.75 1.00 1 1 280 0.0 15.0 9 45 25 2
Grape Nuts Flakes 52.07690 P cold 100 0.88 1.00 3 1 140 3.0 15.0 5 85 25 3
Grape-Nuts 53.37101 P cold 110 0.25 1.00 3 0 170 3.0 17.0 3 90 25 3
Great Grains Pecan 45.81172 P cold 120 0.33 1.00 3 3 75 3.0 13.0 4 100 25 3
Honey Graham Ohs 21.87129 Q cold 120 1.00 1.00 1 2 220 1.0 12.0 11 45 25 2
Honey Nut Cheerios 31.07222 G cold 110 0.75 1.00 3 1 250 1.5 11.5 10 90 25 1
Honey-comb 28.74241 P cold 110 1.33 1.00 1 0 180 0.0 14.0 11 35 25 1
Just Right Crunchy Nuggets 36.52368 K cold 110 1.00 1.00 2 1 170 1.0 17.0 6 60 100 3
Just Right Fruit & Nut 36.47151 K cold 140 0.75 1.30 3 1 170 2.0 20.0 9 95 100 3
Kix 39.24111 G cold 110 1.50 1.00 2 1 260 0.0 21.0 3 40 25 2
Life 45.32807 Q cold 100 0.67 1.00 4 2 150 2.0 12.0 6 95 25 2
Lucky Charms 26.73451 G cold 110 1.00 1.00 2 1 180 0.0 12.0 12 55 25 2
Maypo 54.85092 A hot 100 1.00 1.00 4 1 0 0.0 16.0 3 95 25 2
Muesli Raisins; Dates; & Almonds 37.13686 R cold 150 1.00 1.00 4 3 95 3.0 16.0 11 170 25 3
Muesli Raisins; Peaches; & Pecans 34.13976 R cold 150 1.00 1.00 4 3 150 3.0 16.0 11 170 25 3
Mueslix Crispy Blend 30.31335 K cold 160 0.67 1.50 3 2 150 3.0 17.0 13 160 25 3
Multi-Grain Cheerios 40.10596 G cold 100 1.00 1.00 2 1 220 2.0 15.0 6 90 25 1
Nut&Honey Crunch 29.92429 K cold 120 0.67 1.00 2 1 190 0.0 15.0 9 40 25 2
Nutri-Grain Almond-Raisin 40.69232 K cold 140 0.67 1.33 3 2 220 3.0 21.0 7 130 25 3
Nutri-grain Wheat 59.64284 K cold 90 1.00 1.00 3 0 170 3.0 18.0 2 90 25 3
Oatmeal Raisin Crisp 30.45084 G cold 130 0.50 1.25 3 2 170 1.5 13.5 10 120 25 3
Post Nat. Raisin Bran 37.84059 P cold 120 0.67 1.33 3 1 200 6.0 11.0 14 260 25 3
Product 19 41.50354 K cold 100 1.00 1.00 3 0 320 1.0 20.0 3 45 100 3
Puffed Rice 60.75611 Q cold 50 1.00 0.50 1 0 0 0.0 13.0 0 15 0 3
Puffed Wheat 63.00565 Q cold 50 1.00 0.50 2 0 0 1.0 10.0 0 50 0 3
Quaker Oat Squares 49.51187 Q cold 100 0.50 1.00 4 1 135 2.0 14.0 6 110 25 3
Quaker Oatmeal 50.82839 Q hot 100 0.67 1.00 5 2 0 2.7 -1.0 -1 110 0 1
Raisin Bran 39.25920 K cold 120 0.75 1.33 3 1 210 5.0 14.0 12 240 25 2
Raisin Nut Bran 39.70340 G cold 100 0.50 1.00 3 2 140 2.5 10.5 8 140 25 3
Raisin Squares 55.33314 K cold 90 0.50 1.00 2 0 0 2.0 15.0 6 110 25 3
Rice Chex 41.99893 R cold 110 1.13 1.00 1 0 240 0.0 23.0 2 30 25 1
Rice Krispies 40.56016 K cold 110 1.00 1.00 2 0 290 0.0 22.0 3 35 25 1
Shredded Wheat 68.23588 N cold 80 1.00 0.83 2 0 0 3.0 16.0 0 95 0 1
Shredded Wheat 'n'Bran 74.47295 N cold 90 0.67 1.00 3 0 0 4.0 19.0 0 140 0 1
Shredded Wheat spoon size 72.80179 N cold 90 0.67 1.00 3 0 0 3.0 20.0 0 120 0 1
Smacks 31.23005 K cold 110 0.75 1.00 2 1 70 1.0 9.0 15 40 25 2
Special K 53.13132 K cold 110 1.00 1.00 6 0 230 1.0 16.0 3 55 25 1
Strawberry Fruit Wheats 59.36399 N cold 90 1.00 1.00 2 0 15 3.0 15.0 5 90 25 2
Total Corn Flakes 38.83975 G cold 110 1.00 1.00 2 1 200 0.0 21.0 3 35 100 3
Total Raisin Bran 28.59278 G cold 140 1.00 1.50 3 1 190 4.0 15.0 14 230 100 3
Total Whole Grain 46.65884 G cold 100 1.00 1.00 3 1 200 3.0 16.0 3 110 100 3
Triples 39.10617 G cold 110 0.75 1.00 2 1 250 0.0 21.0 3 60 25 3
Trix 27.75330 G cold 110 1.00 1.00 1 1 140 0.0 13.0 12 25 25 2
Wheat Chex 49.78744 R cold 100 0.67 1.00 3 1 230 3.0 17.0 3 115 25 1
Wheaties 51.59219 G cold 100 1.00 1.00 3 1 200 3.0 17.0 3 110 25 1
Wheaties Honey Gold 36.18756 G cold 110 0.75 1.00 2 1 200 1.0 16.0 8 60 25 1

select(): Handy Helpers!

  • everything() – selects all columns that you have not already specified
  • starts_with() – selects columns with names that start with the specified string
  • ends_with() – selects columns with names that end with the specified string
  • contains() – selects columns with names that contain the specified string
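
As a minimal sketch of the helpers listed above, the code below builds a tiny stand-in data frame (the name `cereal_small` and the choice of five columns are assumptions made for illustration; the values are copied from the first three cereal rows) and applies each helper to it. It assumes the dplyr package is installed.

```r
library(dplyr)

# Hypothetical toy subset of the cereal data (first three rows, five columns).
cereal_small <- data.frame(
  name     = c("100% Bran", "100% Natural Bran", "All-Bran"),
  calories = c(70, 120, 70),
  carbo    = c(5, 8, 7),
  cups     = c(0.33, 1, 0.33),
  sugars   = c(6, 8, 5)
)

# Columns whose names start with "c": calories, carbo, cups
starts_c <- cereal_small |> select(starts_with("c"))

# Columns whose names end with "s": calories, cups, sugars
ends_s <- cereal_small |> select(ends_with("s"))

# Columns whose names contain "ar": carbo, sugars
has_ar <- cereal_small |> select(contains("ar"))

# Move sugars to the front, then keep everything else in its original order
sugars_first <- cereal_small |> select(sugars, everything())

names(starts_c)
names(ends_s)
names(has_ar)
names(sugars_first)
```

The helpers match on column names only, never on the values stored in the columns.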

rename()

  • You can rename columns with select(), but all columns not specified will be dropped.
    • Using the rename() function is easier!
cereal |> 
  rename(temp = type)
name manuf temp calories protein fat sodium fiber carbo sugars potass vitamins shelf weight cups rating
100% Bran N cold 70 4 1 130 10.0 5.0 6 280 25 3 1.00 0.33 68.40297
100% Natural Bran Q cold 120 3 5 15 2.0 8.0 8 135 0 3 1.00 1.00 33.98368
All-Bran K cold 70 4 1 260 9.0 7.0 5 320 25 3 1.00 0.33 59.42551
All-Bran with Extra Fiber K cold 50 4 0 140 14.0 8.0 0 330 25 3 1.00 0.50 93.70491
Almond Delight R cold 110 2 2 200 1.0 14.0 8 -1 25 3 1.00 0.75 34.38484
Apple Cinnamon Cheerios G cold 110 2 2 180 1.5 10.5 10 70 25 1 1.00 0.75 29.50954
Apple Jacks K cold 110 2 0 125 1.0 11.0 14 30 25 2 1.00 1.00 33.17409
Basic 4 G cold 130 3 2 210 2.0 18.0 8 100 25 3 1.33 0.75 37.03856
Bran Chex R cold 90 2 1 200 4.0 15.0 6 125 25 1 1.00 0.67 49.12025
Bran Flakes P cold 90 3 0 210 5.0 13.0 5 190 25 3 1.00 0.67 53.31381
Cap'n'Crunch Q cold 120 1 2 220 0.0 12.0 12 35 25 2 1.00 0.75 18.04285
Cheerios G cold 110 6 2 290 2.0 17.0 1 105 25 1 1.00 1.25 50.76500
Cinnamon Toast Crunch G cold 120 1 3 210 0.0 13.0 9 45 25 2 1.00 0.75 19.82357
Clusters G cold 110 3 2 140 2.0 13.0 7 105 25 3 1.00 0.50 40.40021
Cocoa Puffs G cold 110 1 1 180 0.0 12.0 13 55 25 2 1.00 1.00 22.73645
Corn Chex R cold 110 2 0 280 0.0 22.0 3 25 25 1 1.00 1.00 41.44502
Corn Flakes K cold 100 2 0 290 1.0 21.0 2 35 25 1 1.00 1.00 45.86332
Corn Pops K cold 110 1 0 90 1.0 13.0 12 20 25 2 1.00 1.00 35.78279
Count Chocula G cold 110 1 1 180 0.0 12.0 13 65 25 2 1.00 1.00 22.39651
Cracklin' Oat Bran K cold 110 3 3 140 4.0 10.0 7 160 25 3 1.00 0.50 40.44877
Cream of Wheat (Quick) N hot 100 3 0 80 1.0 21.0 0 -1 0 2 1.00 1.00 64.53382
Crispix K cold 110 2 0 220 1.0 21.0 3 30 25 3 1.00 1.00 46.89564
Crispy Wheat & Raisins G cold 100 2 1 140 2.0 11.0 10 120 25 3 1.00 0.75 36.17620
Double Chex R cold 100 2 0 190 1.0 18.0 5 80 25 3 1.00 0.75 44.33086
Froot Loops K cold 110 2 1 125 1.0 11.0 13 30 25 2 1.00 1.00 32.20758
Frosted Flakes K cold 110 1 0 200 1.0 14.0 11 25 25 1 1.00 0.75 31.43597
Frosted Mini-Wheats K cold 100 3 0 0 3.0 14.0 7 100 25 2 1.00 0.80 58.34514
Fruit & Fibre Dates; Walnuts; and Oats P cold 120 3 2 160 5.0 12.0 10 200 25 3 1.25 0.67 40.91705
Fruitful Bran K cold 120 3 0 240 5.0 14.0 12 190 25 3 1.33 0.67 41.01549
Fruity Pebbles P cold 110 1 1 135 0.0 13.0 12 25 25 2 1.00 0.75 28.02576
Golden Crisp P cold 100 2 0 45 0.0 11.0 15 40 25 1 1.00 0.88 35.25244
Golden Grahams G cold 110 1 1 280 0.0 15.0 9 45 25 2 1.00 0.75 23.80404
Grape Nuts Flakes P cold 100 3 1 140 3.0 15.0 5 85 25 3 1.00 0.88 52.07690
Grape-Nuts P cold 110 3 0 170 3.0 17.0 3 90 25 3 1.00 0.25 53.37101
Great Grains Pecan P cold 120 3 3 75 3.0 13.0 4 100 25 3 1.00 0.33 45.81172
Honey Graham Ohs Q cold 120 1 2 220 1.0 12.0 11 45 25 2 1.00 1.00 21.87129
Honey Nut Cheerios G cold 110 3 1 250 1.5 11.5 10 90 25 1 1.00 0.75 31.07222
Honey-comb P cold 110 1 0 180 0.0 14.0 11 35 25 1 1.00 1.33 28.74241
Just Right Crunchy Nuggets K cold 110 2 1 170 1.0 17.0 6 60 100 3 1.00 1.00 36.52368
Just Right Fruit & Nut K cold 140 3 1 170 2.0 20.0 9 95 100 3 1.30 0.75 36.47151
Kix G cold 110 2 1 260 0.0 21.0 3 40 25 2 1.00 1.50 39.24111
Life Q cold 100 4 2 150 2.0 12.0 6 95 25 2 1.00 0.67 45.32807
Lucky Charms G cold 110 2 1 180 0.0 12.0 12 55 25 2 1.00 1.00 26.73451
Maypo A hot 100 4 1 0 0.0 16.0 3 95 25 2 1.00 1.00 54.85092
Muesli Raisins; Dates; & Almonds R cold 150 4 3 95 3.0 16.0 11 170 25 3 1.00 1.00 37.13686
Muesli Raisins; Peaches; & Pecans R cold 150 4 3 150 3.0 16.0 11 170 25 3 1.00 1.00 34.13976
Mueslix Crispy Blend K cold 160 3 2 150 3.0 17.0 13 160 25 3 1.50 0.67 30.31335
Multi-Grain Cheerios G cold 100 2 1 220 2.0 15.0 6 90 25 1 1.00 1.00 40.10596
Nut&Honey Crunch K cold 120 2 1 190 0.0 15.0 9 40 25 2 1.00 0.67 29.92429
Nutri-Grain Almond-Raisin K cold 140 3 2 220 3.0 21.0 7 130 25 3 1.33 0.67 40.69232
Nutri-grain Wheat K cold 90 3 0 170 3.0 18.0 2 90 25 3 1.00 1.00 59.64284
Oatmeal Raisin Crisp G cold 130 3 2 170 1.5 13.5 10 120 25 3 1.25 0.50 30.45084
Post Nat. Raisin Bran P cold 120 3 1 200 6.0 11.0 14 260 25 3 1.33 0.67 37.84059
Product 19 K cold 100 3 0 320 1.0 20.0 3 45 100 3 1.00 1.00 41.50354
Puffed Rice Q cold 50 1 0 0 0.0 13.0 0 15 0 3 0.50 1.00 60.75611
Puffed Wheat Q cold 50 2 0 0 1.0 10.0 0 50 0 3 0.50 1.00 63.00565
Quaker Oat Squares Q cold 100 4 1 135 2.0 14.0 6 110 25 3 1.00 0.50 49.51187
Quaker Oatmeal Q hot 100 5 2 0 2.7 -1.0 -1 110 0 1 1.00 0.67 50.82839
Raisin Bran K cold 120 3 1 210 5.0 14.0 12 240 25 2 1.33 0.75 39.25920
Raisin Nut Bran G cold 100 3 2 140 2.5 10.5 8 140 25 3 1.00 0.50 39.70340
Raisin Squares K cold 90 2 0 0 2.0 15.0 6 110 25 3 1.00 0.50 55.33314
Rice Chex R cold 110 1 0 240 0.0 23.0 2 30 25 1 1.00 1.13 41.99893
Rice Krispies K cold 110 2 0 290 0.0 22.0 3 35 25 1 1.00 1.00 40.56016
Shredded Wheat N cold 80 2 0 0 3.0 16.0 0 95 0 1 0.83 1.00 68.23588
Shredded Wheat 'n'Bran N cold 90 3 0 0 4.0 19.0 0 140 0 1 1.00 0.67 74.47295
Shredded Wheat spoon size N cold 90 3 0 0 3.0 20.0 0 120 0 1 1.00 0.67 72.80179
Smacks K cold 110 2 1 70 1.0 9.0 15 40 25 2 1.00 0.75 31.23005
Special K K cold 110 6 0 230 1.0 16.0 3 55 25 1 1.00 1.00 53.13132
Strawberry Fruit Wheats N cold 90 2 0 15 3.0 15.0 5 90 25 2 1.00 1.00 59.36399
Total Corn Flakes G cold 110 2 1 200 0.0 21.0 3 35 100 3 1.00 1.00 38.83975
Total Raisin Bran G cold 140 3 1 190 4.0 15.0 14 230 100 3 1.50 1.00 28.59278
Total Whole Grain G cold 100 3 1 200 3.0 16.0 3 110 100 3 1.00 1.00 46.65884
Triples G cold 110 2 1 250 0.0 21.0 3 60 25 3 1.00 0.75 39.10617
Trix G cold 110 1 1 140 0.0 13.0 12 25 25 2 1.00 1.00 27.75330
Wheat Chex R cold 100 3 1 230 3.0 17.0 3 115 25 1 1.00 0.67 49.78744
Wheaties G cold 100 3 1 200 3.0 17.0 3 110 25 1 1.00 1.00 51.59219
Wheaties Honey Gold G cold 110 2 1 200 1.0 16.0 8 60 25 1 1.00 0.75 36.18756
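
To make the difference between the two approaches concrete, here is a small sketch on a toy data frame (the name `cereal_small` and its two rows are assumptions chosen for illustration; dplyr is assumed to be installed). Renaming inside select() keeps only the columns you mention, while rename() keeps all of them.

```r
library(dplyr)

# Hypothetical two-row stand-in for the cereal data.
cereal_small <- data.frame(
  name     = c("100% Bran", "Maypo"),
  type     = c("cold", "hot"),
  calories = c(70, 100)
)

# rename(): new_name = old_name, all other columns are kept
renamed_all <- cereal_small |> rename(temp = type)

# select(): same syntax, but unmentioned columns are dropped
renamed_only <- cereal_small |> select(temp = type)

names(renamed_all)   # "name" "temp" "calories"
names(renamed_only)  # "temp"
```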

How do we “select” in base R?

You don’t really use a specific function! Instead, you index the columns you want by name inside [ ].

cereal[,c("name", "manuf", "calories", "cups")]
name manuf calories cups
100% Bran N 70 0.33
100% Natural Bran Q 120 1.00
All-Bran K 70 0.33
All-Bran with Extra Fiber K 50 0.50
Almond Delight R 110 0.75
Apple Cinnamon Cheerios G 110 0.75
Apple Jacks K 110 1.00
Basic 4 G 130 0.75
Bran Chex R 90 0.67
Bran Flakes P 90 0.67
Cap'n'Crunch Q 120 0.75
Cheerios G 110 1.25
Cinnamon Toast Crunch G 120 0.75
Clusters G 110 0.50
Cocoa Puffs G 110 1.00
Corn Chex R 110 1.00
Corn Flakes K 100 1.00
Corn Pops K 110 1.00
Count Chocula G 110 1.00
Cracklin' Oat Bran K 110 0.50
Cream of Wheat (Quick) N 100 1.00
Crispix K 110 1.00
Crispy Wheat & Raisins G 100 0.75
Double Chex R 100 0.75
Froot Loops K 110 1.00
Frosted Flakes K 110 0.75
Frosted Mini-Wheats K 100 0.80
Fruit & Fibre Dates; Walnuts; and Oats P 120 0.67
Fruitful Bran K 120 0.67
Fruity Pebbles P 110 0.75
Golden Crisp P 100 0.88
Golden Grahams G 110 0.75
Grape Nuts Flakes P 100 0.88
Grape-Nuts P 110 0.25
Great Grains Pecan P 120 0.33
Honey Graham Ohs Q 120 1.00
Honey Nut Cheerios G 110 0.75
Honey-comb P 110 1.33
Just Right Crunchy Nuggets K 110 1.00
Just Right Fruit & Nut K 140 0.75
Kix G 110 1.50
Life Q 100 0.67
Lucky Charms G 110 1.00
Maypo A 100 1.00
Muesli Raisins; Dates; & Almonds R 150 1.00
Muesli Raisins; Peaches; & Pecans R 150 1.00
Mueslix Crispy Blend K 160 0.67
Multi-Grain Cheerios G 100 1.00
Nut&Honey Crunch K 120 0.67
Nutri-Grain Almond-Raisin K 140 0.67
Nutri-grain Wheat K 90 1.00
Oatmeal Raisin Crisp G 130 0.50
Post Nat. Raisin Bran P 120 0.67
Product 19 K 100 1.00
Puffed Rice Q 50 1.00
Puffed Wheat Q 50 1.00
Quaker Oat Squares Q 100 0.50
Quaker Oatmeal Q 100 0.67
Raisin Bran K 120 0.75
Raisin Nut Bran G 100 0.50
Raisin Squares K 90 0.50
Rice Chex R 110 1.13
Rice Krispies K 110 1.00
Shredded Wheat N 80 1.00
Shredded Wheat 'n'Bran N 90 0.67
Shredded Wheat spoon size N 90 0.67
Smacks K 110 0.75
Special K K 110 1.00
Strawberry Fruit Wheats N 90 1.00
Total Corn Flakes G 110 1.00
Total Raisin Bran G 140 1.00
Total Whole Grain G 100 1.00
Triples G 110 0.75
Trix G 110 1.00
Wheat Chex R 100 0.67
Wheaties G 100 1.00
Wheaties Honey Gold G 110 0.75

# Drop a column with subset().
cereal |> 
  subset(select = -c(rating))

# Rename columns 2 through 4 (manuf, type, calories) by position.
colnames(cereal)[2:4] <- c("maker","temp","cals")
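
Renaming by position works only while the columns stay in the same order. As a hedged alternative sketch, the base R code below renames by matching the old names instead; the toy data frame `cereal_small` and the helper vectors `old` and `new` are assumptions introduced for illustration.

```r
# Hypothetical two-row stand-in for the cereal data.
cereal_small <- data.frame(
  name     = c("100% Bran", "Maypo"),
  manuf    = c("N", "A"),
  type     = c("cold", "hot"),
  calories = c(70, 100)
)

# Positional renaming: relies on manuf, type, calories being columns 2 to 4.
by_position <- cereal_small
colnames(by_position)[2:4] <- c("maker", "temp", "cals")

# Name-based renaming: looks up where each old name sits, so column order
# does not matter.
by_name <- cereal_small
old <- c("manuf", "type", "calories")
new <- c("maker", "temp", "cals")
colnames(by_name)[match(old, colnames(by_name))] <- new

colnames(by_position)
colnames(by_name)
```

Both versions give the same result here; the name-based one keeps working if the columns are later reordered.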

mutate()

Mutate (by Allison Horst)

mutate()

The data set gets mutated to either include a new variable…

cereal |> 
  mutate(cal_per_cup = calories / cups)
name manuf type calories protein fat sodium fiber carbo sugars potass vitamins shelf weight cups rating cal_per_cup
100% Bran N cold 70 4 1 130 10.0 5.0 6 280 25 3 1.00 0.33 68.40297 212.12121
100% Natural Bran Q cold 120 3 5 15 2.0 8.0 8 135 0 3 1.00 1.00 33.98368 120.00000
All-Bran K cold 70 4 1 260 9.0 7.0 5 320 25 3 1.00 0.33 59.42551 212.12121
All-Bran with Extra Fiber K cold 50 4 0 140 14.0 8.0 0 330 25 3 1.00 0.50 93.70491 100.00000
Almond Delight R cold 110 2 2 200 1.0 14.0 8 -1 25 3 1.00 0.75 34.38484 146.66667
Apple Cinnamon Cheerios G cold 110 2 2 180 1.5 10.5 10 70 25 1 1.00 0.75 29.50954 146.66667
Apple Jacks K cold 110 2 0 125 1.0 11.0 14 30 25 2 1.00 1.00 33.17409 110.00000
Basic 4 G cold 130 3 2 210 2.0 18.0 8 100 25 3 1.33 0.75 37.03856 173.33333
Bran Chex R cold 90 2 1 200 4.0 15.0 6 125 25 1 1.00 0.67 49.12025 134.32836
Bran Flakes P cold 90 3 0 210 5.0 13.0 5 190 25 3 1.00 0.67 53.31381 134.32836
Cap'n'Crunch Q cold 120 1 2 220 0.0 12.0 12 35 25 2 1.00 0.75 18.04285 160.00000
Cheerios G cold 110 6 2 290 2.0 17.0 1 105 25 1 1.00 1.25 50.76500 88.00000
Cinnamon Toast Crunch G cold 120 1 3 210 0.0 13.0 9 45 25 2 1.00 0.75 19.82357 160.00000
Clusters G cold 110 3 2 140 2.0 13.0 7 105 25 3 1.00 0.50 40.40021 220.00000
Cocoa Puffs G cold 110 1 1 180 0.0 12.0 13 55 25 2 1.00 1.00 22.73645 110.00000
Corn Chex R cold 110 2 0 280 0.0 22.0 3 25 25 1 1.00 1.00 41.44502 110.00000
Corn Flakes K cold 100 2 0 290 1.0 21.0 2 35 25 1 1.00 1.00 45.86332 100.00000
Corn Pops K cold 110 1 0 90 1.0 13.0 12 20 25 2 1.00 1.00 35.78279 110.00000
Count Chocula G cold 110 1 1 180 0.0 12.0 13 65 25 2 1.00 1.00 22.39651 110.00000
Cracklin' Oat Bran K cold 110 3 3 140 4.0 10.0 7 160 25 3 1.00 0.50 40.44877 220.00000
Cream of Wheat (Quick) N hot 100 3 0 80 1.0 21.0 0 -1 0 2 1.00 1.00 64.53382 100.00000
Crispix K cold 110 2 0 220 1.0 21.0 3 30 25 3 1.00 1.00 46.89564 110.00000
Crispy Wheat & Raisins G cold 100 2 1 140 2.0 11.0 10 120 25 3 1.00 0.75 36.17620 133.33333
Double Chex R cold 100 2 0 190 1.0 18.0 5 80 25 3 1.00 0.75 44.33086 133.33333
Froot Loops K cold 110 2 1 125 1.0 11.0 13 30 25 2 1.00 1.00 32.20758 110.00000
Frosted Flakes K cold 110 1 0 200 1.0 14.0 11 25 25 1 1.00 0.75 31.43597 146.66667
Frosted Mini-Wheats K cold 100 3 0 0 3.0 14.0 7 100 25 2 1.00 0.80 58.34514 125.00000
Fruit & Fibre Dates; Walnuts; and Oats P cold 120 3 2 160 5.0 12.0 10 200 25 3 1.25 0.67 40.91705 179.10448
Fruitful Bran K cold 120 3 0 240 5.0 14.0 12 190 25 3 1.33 0.67 41.01549 179.10448
Fruity Pebbles P cold 110 1 1 135 0.0 13.0 12 25 25 2 1.00 0.75 28.02576 146.66667
Golden Crisp P cold 100 2 0 45 0.0 11.0 15 40 25 1 1.00 0.88 35.25244 113.63636
Golden Grahams G cold 110 1 1 280 0.0 15.0 9 45 25 2 1.00 0.75 23.80404 146.66667
Grape Nuts Flakes P cold 100 3 1 140 3.0 15.0 5 85 25 3 1.00 0.88 52.07690 113.63636
Grape-Nuts P cold 110 3 0 170 3.0 17.0 3 90 25 3 1.00 0.25 53.37101 440.00000
Great Grains Pecan P cold 120 3 3 75 3.0 13.0 4 100 25 3 1.00 0.33 45.81172 363.63636
Honey Graham Ohs Q cold 120 1 2 220 1.0 12.0 11 45 25 2 1.00 1.00 21.87129 120.00000
Honey Nut Cheerios G cold 110 3 1 250 1.5 11.5 10 90 25 1 1.00 0.75 31.07222 146.66667
Honey-comb P cold 110 1 0 180 0.0 14.0 11 35 25 1 1.00 1.33 28.74241 82.70677
Just Right Crunchy Nuggets K cold 110 2 1 170 1.0 17.0 6 60 100 3 1.00 1.00 36.52368 110.00000
Just Right Fruit & Nut K cold 140 3 1 170 2.0 20.0 9 95 100 3 1.30 0.75 36.47151 186.66667
Kix G cold 110 2 1 260 0.0 21.0 3 40 25 2 1.00 1.50 39.24111 73.33333
Life Q cold 100 4 2 150 2.0 12.0 6 95 25 2 1.00 0.67 45.32807 149.25373
Lucky Charms G cold 110 2 1 180 0.0 12.0 12 55 25 2 1.00 1.00 26.73451 110.00000
Maypo A hot 100 4 1 0 0.0 16.0 3 95 25 2 1.00 1.00 54.85092 100.00000
Muesli Raisins; Dates; & Almonds R cold 150 4 3 95 3.0 16.0 11 170 25 3 1.00 1.00 37.13686 150.00000
Muesli Raisins; Peaches; & Pecans R cold 150 4 3 150 3.0 16.0 11 170 25 3 1.00 1.00 34.13976 150.00000
Mueslix Crispy Blend K cold 160 3 2 150 3.0 17.0 13 160 25 3 1.50 0.67 30.31335 238.80597
Multi-Grain Cheerios G cold 100 2 1 220 2.0 15.0 6 90 25 1 1.00 1.00 40.10596 100.00000
Nut&Honey Crunch K cold 120 2 1 190 0.0 15.0 9 40 25 2 1.00 0.67 29.92429 179.10448
Nutri-Grain Almond-Raisin K cold 140 3 2 220 3.0 21.0 7 130 25 3 1.33 0.67 40.69232 208.95522
Nutri-grain Wheat K cold 90 3 0 170 3.0 18.0 2 90 25 3 1.00 1.00 59.64284 90.00000
Oatmeal Raisin Crisp G cold 130 3 2 170 1.5 13.5 10 120 25 3 1.25 0.50 30.45084 260.00000
Post Nat. Raisin Bran P cold 120 3 1 200 6.0 11.0 14 260 25 3 1.33 0.67 37.84059 179.10448
Product 19 K cold 100 3 0 320 1.0 20.0 3 45 100 3 1.00 1.00 41.50354 100.00000
Puffed Rice Q cold 50 1 0 0 0.0 13.0 0 15 0 3 0.50 1.00 60.75611 50.00000
Puffed Wheat Q cold 50 2 0 0 1.0 10.0 0 50 0 3 0.50 1.00 63.00565 50.00000
Quaker Oat Squares Q cold 100 4 1 135 2.0 14.0 6 110 25 3 1.00 0.50 49.51187 200.00000
Quaker Oatmeal Q hot 100 5 2 0 2.7 -1.0 -1 110 0 1 1.00 0.67 50.82839 149.25373
Raisin Bran K cold 120 3 1 210 5.0 14.0 12 240 25 2 1.33 0.75 39.25920 160.00000
Raisin Nut Bran G cold 100 3 2 140 2.5 10.5 8 140 25 3 1.00 0.50 39.70340 200.00000
Raisin Squares K cold 90 2 0 0 2.0 15.0 6 110 25 3 1.00 0.50 55.33314 180.00000
Rice Chex R cold 110 1 0 240 0.0 23.0 2 30 25 1 1.00 1.13 41.99893 97.34513
Rice Krispies K cold 110 2 0 290 0.0 22.0 3 35 25 1 1.00 1.00 40.56016 110.00000
Shredded Wheat N cold 80 2 0 0 3.0 16.0 0 95 0 1 0.83 1.00 68.23588 80.00000
Shredded Wheat 'n'Bran N cold 90 3 0 0 4.0 19.0 0 140 0 1 1.00 0.67 74.47295 134.32836
Shredded Wheat spoon size N cold 90 3 0 0 3.0 20.0 0 120 0 1 1.00 0.67 72.80179 134.32836
Smacks K cold 110 2 1 70 1.0 9.0 15 40 25 2 1.00 0.75 31.23005 146.66667
Special K K cold 110 6 0 230 1.0 16.0 3 55 25 1 1.00 1.00 53.13132 110.00000
Strawberry Fruit Wheats N cold 90 2 0 15 3.0 15.0 5 90 25 2 1.00 1.00 59.36399 90.00000
Total Corn Flakes G cold 110 2 1 200 0.0 21.0 3 35 100 3 1.00 1.00 38.83975 110.00000
Total Raisin Bran G cold 140 3 1 190 4.0 15.0 14 230 100 3 1.50 1.00 28.59278 140.00000
Total Whole Grain G cold 100 3 1 200 3.0 16.0 3 110 100 3 1.00 1.00 46.65884 100.00000
Triples G cold 110 2 1 250 0.0 21.0 3 60 25 3 1.00 0.75 39.10617 146.66667
Trix G cold 110 1 1 140 0.0 13.0 12 25 25 2 1.00 1.00 27.75330 110.00000
Wheat Chex R cold 100 3 1 230 3.0 17.0 3 115 25 1 1.00 0.67 49.78744 149.25373
Wheaties G cold 100 3 1 200 3.0 17.0 3 110 25 1 1.00 1.00 51.59219 100.00000
Wheaties Honey Gold G cold 110 2 1 200 1.0 16.0 8 60 25 1 1.00 0.75 36.18756 146.66667

…OR revise an existing variable.

cereal |> 
  mutate(shelf = as.factor(shelf))

mutate(): Handy Helpers!

  • if_else() or case_when() – vectorized shortcuts for if-else statements (two outcomes, or several conditions)
  • as.factor(), as.numeric(), etc. – change variable type
  • +, -, *, / – basic mathematical operations
  • %% – modulo (returns the remainder when doing division)
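
The sketch below uses each helper from the list inside one mutate() call. The toy data frame `cereal_small`, the new column names, and the calorie cut-offs (70 and 100) are all assumptions made up for illustration; dplyr is assumed to be installed.

```r
library(dplyr)

# Hypothetical four-row stand-in for the cereal data.
cereal_small <- data.frame(
  name     = c("100% Bran", "Apple Cinnamon Cheerios",
               "All-Bran with Extra Fiber", "Apple Jacks"),
  calories = c(70, 110, 50, 110),
  sugars   = c(6, 10, 0, 14),
  shelf    = c(3, 1, 3, 2)
)

result <- cereal_small |>
  mutate(
    # if_else(): one condition, two possible outcomes
    sugar_label = if_else(sugars == 0, "no sugar", "has sugar"),
    # case_when(): several conditions, checked from top to bottom
    cal_level = case_when(
      calories < 70   ~ "low",
      calories <= 100 ~ "medium",
      TRUE            ~ "high"
    ),
    # as.factor(): change the variable type
    shelf = as.factor(shelf),
    # %%: remainder after dividing calories by 50
    cal_remainder = calories %% 50
  )

result
```

case_when() stops at the first condition that is true for a row, so the order of the conditions matters.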

How do we “mutate” in base R?

You can define new columns…

cereal$cal_per_cup <- cereal$calories / cereal$cups
name manuf type calories protein fat sodium fiber carbo sugars potass vitamins shelf weight cups rating cal_per_cup
100% Bran N cold 70 4 1 130 10.0 5.0 6 280 25 3 1.00 0.33 68.40297 212.12121
100% Natural Bran Q cold 120 3 5 15 2.0 8.0 8 135 0 3 1.00 1.00 33.98368 120.00000
All-Bran K cold 70 4 1 260 9.0 7.0 5 320 25 3 1.00 0.33 59.42551 212.12121
All-Bran with Extra Fiber K cold 50 4 0 140 14.0 8.0 0 330 25 3 1.00 0.50 93.70491 100.00000
Almond Delight R cold 110 2 2 200 1.0 14.0 8 -1 25 3 1.00 0.75 34.38484 146.66667
Apple Cinnamon Cheerios G cold 110 2 2 180 1.5 10.5 10 70 25 1 1.00 0.75 29.50954 146.66667
Apple Jacks K cold 110 2 0 125 1.0 11.0 14 30 25 2 1.00 1.00 33.17409 110.00000
Basic 4 G cold 130 3 2 210 2.0 18.0 8 100 25 3 1.33 0.75 37.03856 173.33333
Bran Chex R cold 90 2 1 200 4.0 15.0 6 125 25 1 1.00 0.67 49.12025 134.32836
Bran Flakes P cold 90 3 0 210 5.0 13.0 5 190 25 3 1.00 0.67 53.31381 134.32836
Cap'n'Crunch Q cold 120 1 2 220 0.0 12.0 12 35 25 2 1.00 0.75 18.04285 160.00000
Cheerios G cold 110 6 2 290 2.0 17.0 1 105 25 1 1.00 1.25 50.76500 88.00000
Cinnamon Toast Crunch G cold 120 1 3 210 0.0 13.0 9 45 25 2 1.00 0.75 19.82357 160.00000
Clusters G cold 110 3 2 140 2.0 13.0 7 105 25 3 1.00 0.50 40.40021 220.00000
Cocoa Puffs G cold 110 1 1 180 0.0 12.0 13 55 25 2 1.00 1.00 22.73645 110.00000
Corn Chex R cold 110 2 0 280 0.0 22.0 3 25 25 1 1.00 1.00 41.44502 110.00000
Corn Flakes K cold 100 2 0 290 1.0 21.0 2 35 25 1 1.00 1.00 45.86332 100.00000
Corn Pops K cold 110 1 0 90 1.0 13.0 12 20 25 2 1.00 1.00 35.78279 110.00000
Count Chocula G cold 110 1 1 180 0.0 12.0 13 65 25 2 1.00 1.00 22.39651 110.00000
Cracklin' Oat Bran K cold 110 3 3 140 4.0 10.0 7 160 25 3 1.00 0.50 40.44877 220.00000
Cream of Wheat (Quick) N hot 100 3 0 80 1.0 21.0 0 -1 0 2 1.00 1.00 64.53382 100.00000
Crispix K cold 110 2 0 220 1.0 21.0 3 30 25 3 1.00 1.00 46.89564 110.00000
Crispy Wheat & Raisins G cold 100 2 1 140 2.0 11.0 10 120 25 3 1.00 0.75 36.17620 133.33333
Double Chex R cold 100 2 0 190 1.0 18.0 5 80 25 3 1.00 0.75 44.33086 133.33333
Froot Loops K cold 110 2 1 125 1.0 11.0 13 30 25 2 1.00 1.00 32.20758 110.00000
Frosted Flakes K cold 110 1 0 200 1.0 14.0 11 25 25 1 1.00 0.75 31.43597 146.66667
Frosted Mini-Wheats K cold 100 3 0 0 3.0 14.0 7 100 25 2 1.00 0.80 58.34514 125.00000
Fruit & Fibre Dates; Walnuts; and Oats P cold 120 3 2 160 5.0 12.0 10 200 25 3 1.25 0.67 40.91705 179.10448
Fruitful Bran K cold 120 3 0 240 5.0 14.0 12 190 25 3 1.33 0.67 41.01549 179.10448
Fruity Pebbles P cold 110 1 1 135 0.0 13.0 12 25 25 2 1.00 0.75 28.02576 146.66667
Golden Crisp P cold 100 2 0 45 0.0 11.0 15 40 25 1 1.00 0.88 35.25244 113.63636
Golden Grahams G cold 110 1 1 280 0.0 15.0 9 45 25 2 1.00 0.75 23.80404 146.66667
Grape Nuts Flakes P cold 100 3 1 140 3.0 15.0 5 85 25 3 1.00 0.88 52.07690 113.63636
Grape-Nuts P cold 110 3 0 170 3.0 17.0 3 90 25 3 1.00 0.25 53.37101 440.00000
Great Grains Pecan P cold 120 3 3 75 3.0 13.0 4 100 25 3 1.00 0.33 45.81172 363.63636
Honey Graham Ohs Q cold 120 1 2 220 1.0 12.0 11 45 25 2 1.00 1.00 21.87129 120.00000
Honey Nut Cheerios G cold 110 3 1 250 1.5 11.5 10 90 25 1 1.00 0.75 31.07222 146.66667
Honey-comb P cold 110 1 0 180 0.0 14.0 11 35 25 1 1.00 1.33 28.74241 82.70677
Just Right Crunchy Nuggets K cold 110 2 1 170 1.0 17.0 6 60 100 3 1.00 1.00 36.52368 110.00000
Just Right Fruit & Nut K cold 140 3 1 170 2.0 20.0 9 95 100 3 1.30 0.75 36.47151 186.66667
Kix G cold 110 2 1 260 0.0 21.0 3 40 25 2 1.00 1.50 39.24111 73.33333
Life Q cold 100 4 2 150 2.0 12.0 6 95 25 2 1.00 0.67 45.32807 149.25373
Lucky Charms G cold 110 2 1 180 0.0 12.0 12 55 25 2 1.00 1.00 26.73451 110.00000
Maypo A hot 100 4 1 0 0.0 16.0 3 95 25 2 1.00 1.00 54.85092 100.00000
Muesli Raisins; Dates; & Almonds R cold 150 4 3 95 3.0 16.0 11 170 25 3 1.00 1.00 37.13686 150.00000
Muesli Raisins; Peaches; & Pecans R cold 150 4 3 150 3.0 16.0 11 170 25 3 1.00 1.00 34.13976 150.00000
Mueslix Crispy Blend K cold 160 3 2 150 3.0 17.0 13 160 25 3 1.50 0.67 30.31335 238.80597
Multi-Grain Cheerios G cold 100 2 1 220 2.0 15.0 6 90 25 1 1.00 1.00 40.10596 100.00000
Nut&Honey Crunch K cold 120 2 1 190 0.0 15.0 9 40 25 2 1.00 0.67 29.92429 179.10448
Nutri-Grain Almond-Raisin K cold 140 3 2 220 3.0 21.0 7 130 25 3 1.33 0.67 40.69232 208.95522
Nutri-grain Wheat K cold 90 3 0 170 3.0 18.0 2 90 25 3 1.00 1.00 59.64284 90.00000
Oatmeal Raisin Crisp G cold 130 3 2 170 1.5 13.5 10 120 25 3 1.25 0.50 30.45084 260.00000
Post Nat. Raisin Bran P cold 120 3 1 200 6.0 11.0 14 260 25 3 1.33 0.67 37.84059 179.10448
Product 19 K cold 100 3 0 320 1.0 20.0 3 45 100 3 1.00 1.00 41.50354 100.00000
Puffed Rice Q cold 50 1 0 0 0.0 13.0 0 15 0 3 0.50 1.00 60.75611 50.00000
Puffed Wheat Q cold 50 2 0 0 1.0 10.0 0 50 0 3 0.50 1.00 63.00565 50.00000
Quaker Oat Squares Q cold 100 4 1 135 2.0 14.0 6 110 25 3 1.00 0.50 49.51187 200.00000
Quaker Oatmeal Q hot 100 5 2 0 2.7 -1.0 -1 110 0 1 1.00 0.67 50.82839 149.25373
Raisin Bran K cold 120 3 1 210 5.0 14.0 12 240 25 2 1.33 0.75 39.25920 160.00000
Raisin Nut Bran G cold 100 3 2 140 2.5 10.5 8 140 25 3 1.00 0.50 39.70340 200.00000
Raisin Squares K cold 90 2 0 0 2.0 15.0 6 110 25 3 1.00 0.50 55.33314 180.00000
Rice Chex R cold 110 1 0 240 0.0 23.0 2 30 25 1 1.00 1.13 41.99893 97.34513
Rice Krispies K cold 110 2 0 290 0.0 22.0 3 35 25 1 1.00 1.00 40.56016 110.00000
Shredded Wheat N cold 80 2 0 0 3.0 16.0 0 95 0 1 0.83 1.00 68.23588 80.00000
Shredded Wheat 'n'Bran N cold 90 3 0 0 4.0 19.0 0 140 0 1 1.00 0.67 74.47295 134.32836
Shredded Wheat spoon size N cold 90 3 0 0 3.0 20.0 0 120 0 1 1.00 0.67 72.80179 134.32836
Smacks K cold 110 2 1 70 1.0 9.0 15 40 25 2 1.00 0.75 31.23005 146.66667
Special K K cold 110 6 0 230 1.0 16.0 3 55 25 1 1.00 1.00 53.13132 110.00000
Strawberry Fruit Wheats N cold 90 2 0 15 3.0 15.0 5 90 25 2 1.00 1.00 59.36399 90.00000
Total Corn Flakes G cold 110 2 1 200 0.0 21.0 3 35 100 3 1.00 1.00 38.83975 110.00000
Total Raisin Bran G cold 140 3 1 190 4.0 15.0 14 230 100 3 1.50 1.00 28.59278 140.00000
Total Whole Grain G cold 100 3 1 200 3.0 16.0 3 110 100 3 1.00 1.00 46.65884 100.00000
Triples G cold 110 2 1 250 0.0 21.0 3 60 25 3 1.00 0.75 39.10617 146.66667
Trix G cold 110 1 1 140 0.0 13.0 12 25 25 2 1.00 1.00 27.75330 110.00000
Wheat Chex R cold 100 3 1 230 3.0 17.0 3 115 25 1 1.00 0.67 49.78744 149.25373
Wheaties G cold 100 3 1 200 3.0 17.0 3 110 25 1 1.00 1.00 51.59219 100.00000
Wheaties Honey Gold G cold 110 2 1 200 1.0 16.0 8 60 25 1 1.00 0.75 36.18756 146.66667

…OR overwrite old ones!

cereal$shelf <- as.factor(cereal$shelf)
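The same overwrite can be written with mutate() by reusing the existing column name. A minimal sketch on a made-up data frame (`toy` is hypothetical, standing in for cereal):

```r
library(dplyr)

# Hypothetical stand-in for the cereal data: shelf stored as integers
toy <- data.frame(shelf = c(3L, 1L, 2L, 3L))

# Reusing the name "shelf" replaces the column instead of adding a new one
toy <- toy |>
  mutate(shelf = as.factor(shelf))

class(toy$shelf)   # "factor"
levels(toy$shelf)
```

The number of columns does not change, because no new column name was introduced.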

group_by()

The ungroup() command can be just as important as the group_by() command! (by Allison Horst)

group_by()

Separate the data into different groups based on a categorical variable.

  • The data is grouped internally, but the rows themselves do not change. The only visible difference is the "Groups:" line in the printed tibble.
cereal |> 
  group_by(type)
# A tibble: 77 × 16
# Groups:   type [2]
   name      manuf type  calories protein   fat sodium fiber carbo sugars potass
   <fct>     <fct> <fct>    <int>   <int> <int>  <int> <dbl> <dbl>  <int>  <int>
 1 100% Bran N     cold        70       4     1    130  10     5        6    280
 2 100% Nat… Q     cold       120       3     5     15   2     8        8    135
 3 All-Bran  K     cold        70       4     1    260   9     7        5    320
 4 All-Bran… K     cold        50       4     0    140  14     8        0    330
 5 Almond D… R     cold       110       2     2    200   1    14        8     -1
 6 Apple Ci… G     cold       110       2     2    180   1.5  10.5     10     70
 7 Apple Ja… K     cold       110       2     0    125   1    11       14     30
 8 Basic 4   G     cold       130       3     2    210   2    18        8    100
 9 Bran Chex R     cold        90       2     1    200   4    15        6    125
10 Bran Fla… P     cold        90       3     0    210   5    13        5    190
# ℹ 67 more rows
# ℹ 5 more variables: vitamins <int>, shelf <int>, weight <dbl>, cups <dbl>,
#   rating <dbl>
name manuf type calories protein fat sodium fiber carbo sugars potass vitamins shelf weight cups rating
100% Bran N cold 70 4 1 130 10.0 5.0 6 280 25 3 1.00 0.33 68.40297
100% Natural Bran Q cold 120 3 5 15 2.0 8.0 8 135 0 3 1.00 1.00 33.98368
All-Bran K cold 70 4 1 260 9.0 7.0 5 320 25 3 1.00 0.33 59.42551
All-Bran with Extra Fiber K cold 50 4 0 140 14.0 8.0 0 330 25 3 1.00 0.50 93.70491
Almond Delight R cold 110 2 2 200 1.0 14.0 8 -1 25 3 1.00 0.75 34.38484
Apple Cinnamon Cheerios G cold 110 2 2 180 1.5 10.5 10 70 25 1 1.00 0.75 29.50954
Apple Jacks K cold 110 2 0 125 1.0 11.0 14 30 25 2 1.00 1.00 33.17409
Basic 4 G cold 130 3 2 210 2.0 18.0 8 100 25 3 1.33 0.75 37.03856
Bran Chex R cold 90 2 1 200 4.0 15.0 6 125 25 1 1.00 0.67 49.12025
Bran Flakes P cold 90 3 0 210 5.0 13.0 5 190 25 3 1.00 0.67 53.31381
Cap'n'Crunch Q cold 120 1 2 220 0.0 12.0 12 35 25 2 1.00 0.75 18.04285
Cheerios G cold 110 6 2 290 2.0 17.0 1 105 25 1 1.00 1.25 50.76500
Cinnamon Toast Crunch G cold 120 1 3 210 0.0 13.0 9 45 25 2 1.00 0.75 19.82357
Clusters G cold 110 3 2 140 2.0 13.0 7 105 25 3 1.00 0.50 40.40021
Cocoa Puffs G cold 110 1 1 180 0.0 12.0 13 55 25 2 1.00 1.00 22.73645
Corn Chex R cold 110 2 0 280 0.0 22.0 3 25 25 1 1.00 1.00 41.44502
Corn Flakes K cold 100 2 0 290 1.0 21.0 2 35 25 1 1.00 1.00 45.86332
Corn Pops K cold 110 1 0 90 1.0 13.0 12 20 25 2 1.00 1.00 35.78279
Count Chocula G cold 110 1 1 180 0.0 12.0 13 65 25 2 1.00 1.00 22.39651
Cracklin' Oat Bran K cold 110 3 3 140 4.0 10.0 7 160 25 3 1.00 0.50 40.44877
Cream of Wheat (Quick) N hot 100 3 0 80 1.0 21.0 0 -1 0 2 1.00 1.00 64.53382
Crispix K cold 110 2 0 220 1.0 21.0 3 30 25 3 1.00 1.00 46.89564
Crispy Wheat & Raisins G cold 100 2 1 140 2.0 11.0 10 120 25 3 1.00 0.75 36.17620
Double Chex R cold 100 2 0 190 1.0 18.0 5 80 25 3 1.00 0.75 44.33086
Froot Loops K cold 110 2 1 125 1.0 11.0 13 30 25 2 1.00 1.00 32.20758
Frosted Flakes K cold 110 1 0 200 1.0 14.0 11 25 25 1 1.00 0.75 31.43597
Frosted Mini-Wheats K cold 100 3 0 0 3.0 14.0 7 100 25 2 1.00 0.80 58.34514
Fruit & Fibre Dates; Walnuts; and Oats P cold 120 3 2 160 5.0 12.0 10 200 25 3 1.25 0.67 40.91705
Fruitful Bran K cold 120 3 0 240 5.0 14.0 12 190 25 3 1.33 0.67 41.01549
Fruity Pebbles P cold 110 1 1 135 0.0 13.0 12 25 25 2 1.00 0.75 28.02576
Golden Crisp P cold 100 2 0 45 0.0 11.0 15 40 25 1 1.00 0.88 35.25244
Golden Grahams G cold 110 1 1 280 0.0 15.0 9 45 25 2 1.00 0.75 23.80404
Grape Nuts Flakes P cold 100 3 1 140 3.0 15.0 5 85 25 3 1.00 0.88 52.07690
Grape-Nuts P cold 110 3 0 170 3.0 17.0 3 90 25 3 1.00 0.25 53.37101
Great Grains Pecan P cold 120 3 3 75 3.0 13.0 4 100 25 3 1.00 0.33 45.81172
Honey Graham Ohs Q cold 120 1 2 220 1.0 12.0 11 45 25 2 1.00 1.00 21.87129
Honey Nut Cheerios G cold 110 3 1 250 1.5 11.5 10 90 25 1 1.00 0.75 31.07222
Honey-comb P cold 110 1 0 180 0.0 14.0 11 35 25 1 1.00 1.33 28.74241
Just Right Crunchy Nuggets K cold 110 2 1 170 1.0 17.0 6 60 100 3 1.00 1.00 36.52368
Just Right Fruit & Nut K cold 140 3 1 170 2.0 20.0 9 95 100 3 1.30 0.75 36.47151
Kix G cold 110 2 1 260 0.0 21.0 3 40 25 2 1.00 1.50 39.24111
Life Q cold 100 4 2 150 2.0 12.0 6 95 25 2 1.00 0.67 45.32807
Lucky Charms G cold 110 2 1 180 0.0 12.0 12 55 25 2 1.00 1.00 26.73451
Maypo A hot 100 4 1 0 0.0 16.0 3 95 25 2 1.00 1.00 54.85092
Muesli Raisins; Dates; & Almonds R cold 150 4 3 95 3.0 16.0 11 170 25 3 1.00 1.00 37.13686
Muesli Raisins; Peaches; & Pecans R cold 150 4 3 150 3.0 16.0 11 170 25 3 1.00 1.00 34.13976
Mueslix Crispy Blend K cold 160 3 2 150 3.0 17.0 13 160 25 3 1.50 0.67 30.31335
Multi-Grain Cheerios G cold 100 2 1 220 2.0 15.0 6 90 25 1 1.00 1.00 40.10596
Nut&Honey Crunch K cold 120 2 1 190 0.0 15.0 9 40 25 2 1.00 0.67 29.92429
Nutri-Grain Almond-Raisin K cold 140 3 2 220 3.0 21.0 7 130 25 3 1.33 0.67 40.69232
Nutri-grain Wheat K cold 90 3 0 170 3.0 18.0 2 90 25 3 1.00 1.00 59.64284
Oatmeal Raisin Crisp G cold 130 3 2 170 1.5 13.5 10 120 25 3 1.25 0.50 30.45084
Post Nat. Raisin Bran P cold 120 3 1 200 6.0 11.0 14 260 25 3 1.33 0.67 37.84059
Product 19 K cold 100 3 0 320 1.0 20.0 3 45 100 3 1.00 1.00 41.50354
Puffed Rice Q cold 50 1 0 0 0.0 13.0 0 15 0 3 0.50 1.00 60.75611
Puffed Wheat Q cold 50 2 0 0 1.0 10.0 0 50 0 3 0.50 1.00 63.00565
Quaker Oat Squares Q cold 100 4 1 135 2.0 14.0 6 110 25 3 1.00 0.50 49.51187
Quaker Oatmeal Q hot 100 5 2 0 2.7 -1.0 -1 110 0 1 1.00 0.67 50.82839
Raisin Bran K cold 120 3 1 210 5.0 14.0 12 240 25 2 1.33 0.75 39.25920
Raisin Nut Bran G cold 100 3 2 140 2.5 10.5 8 140 25 3 1.00 0.50 39.70340
Raisin Squares K cold 90 2 0 0 2.0 15.0 6 110 25 3 1.00 0.50 55.33314
Rice Chex R cold 110 1 0 240 0.0 23.0 2 30 25 1 1.00 1.13 41.99893
Rice Krispies K cold 110 2 0 290 0.0 22.0 3 35 25 1 1.00 1.00 40.56016
Shredded Wheat N cold 80 2 0 0 3.0 16.0 0 95 0 1 0.83 1.00 68.23588
Shredded Wheat 'n'Bran N cold 90 3 0 0 4.0 19.0 0 140 0 1 1.00 0.67 74.47295
Shredded Wheat spoon size N cold 90 3 0 0 3.0 20.0 0 120 0 1 1.00 0.67 72.80179
Smacks K cold 110 2 1 70 1.0 9.0 15 40 25 2 1.00 0.75 31.23005
Special K K cold 110 6 0 230 1.0 16.0 3 55 25 1 1.00 1.00 53.13132
Strawberry Fruit Wheats N cold 90 2 0 15 3.0 15.0 5 90 25 2 1.00 1.00 59.36399
Total Corn Flakes G cold 110 2 1 200 0.0 21.0 3 35 100 3 1.00 1.00 38.83975
Total Raisin Bran G cold 140 3 1 190 4.0 15.0 14 230 100 3 1.50 1.00 28.59278
Total Whole Grain G cold 100 3 1 200 3.0 16.0 3 110 100 3 1.00 1.00 46.65884
Triples G cold 110 2 1 250 0.0 21.0 3 60 25 3 1.00 0.75 39.10617
Trix G cold 110 1 1 140 0.0 13.0 12 25 25 2 1.00 1.00 27.75330
Wheat Chex R cold 100 3 1 230 3.0 17.0 3 115 25 1 1.00 0.67 49.78744
Wheaties G cold 100 3 1 200 3.0 17.0 3 110 25 1 1.00 1.00 51.59219
Wheaties Honey Gold G cold 110 2 1 200 1.0 16.0 8 60 25 1 1.00 0.75 36.18756

ungroup()

The ungroup() function will remove the internal grouping in your data.

  • You often do not need this, but if you get unexpected errors or results downstream of a group_by() statement, try ungrouping your data!
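A minimal sketch of what "internal grouping" means in practice, using a small made-up data frame (`toy` and its values are hypothetical): while the groups are attached, mutate() works within each group; after ungroup(), it works on all rows.

```r
library(dplyr)

# Hypothetical toy data, not the real cereal data
toy <- data.frame(
  manuf  = c("G", "G", "K", "K"),
  sugars = c(10, 8, 5, 14)
)

# Grouping is still attached, so mean() is computed within each manuf
still_grouped <- toy |>
  group_by(manuf) |>
  mutate(mean_sugar = mean(sugars))

# After ungroup(), mean() is computed over all rows
ungrouped <- still_grouped |>
  ungroup() |>
  mutate(overall_mean = mean(sugars))

group_vars(still_grouped)  # "manuf"
group_vars(ungrouped)      # character(0)
```

group_vars() is a quick way to check whether a data frame is still carrying groups.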

summarize()

group_by() is almost always paired with summarize()!

summarize()

We can calculate summaries of variables in the data.

cereal |>
  summarise(mean_fiber = mean(fiber))
  mean_fiber
1   2.151948

Or multiple summaries at the same time.

cereal |>
  summarise(mean_fiber = mean(fiber),
            num_cereals = n(),
            mean_sugar = mean(sugars))
  mean_fiber num_cereals mean_sugar
1   2.151948          77   6.922078

Note

summarize() and summarise() are synonyms!

summarize(): Handy Helpers!

  • mean(), median(), sd(), sum()
  • min(), max()
  • n(), n_distinct() – counts the number of (distinct) elements
  • first(), last(), nth() – extract the first, last, or nth element
  • across() – apply a function across columns
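A short sketch of several of these helpers inside one summarise() call. The four rows below are copied from the cereal data; the data frame name `toy` and the output column names are only illustrative.

```r
library(dplyr)

# Four rows taken from the cereal data (name, manuf, sugars only)
toy <- data.frame(
  name   = c("Cheerios", "Kix", "Trix", "Smacks"),
  manuf  = c("G", "G", "G", "K"),
  sugars = c(1, 3, 12, 15)
)

helpers <- toy |>
  summarise(
    n_rows       = n(),                # number of rows
    n_manuf      = n_distinct(manuf),  # number of distinct manufacturers
    first_name   = first(name),        # first element
    second_name  = nth(name, 2),       # second element
    max_sugar    = max(sugars),
    median_sugar = median(sugars)
  )

helpers
```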

group_by() + summarize()!

  1. group_by a variable (or multiple variables)
  2. summarize a variable (or multiple variables) within the groups
cereal |> 
  group_by(manuf) |> 
  summarise(mean_sugar = mean(sugars))
manuf mean_sugar
A 3.000000
G 7.954546
K 7.565217
N 1.833333
P 8.777778
Q 5.250000
R 6.125000

group_by() + mutate()!

  1. group_by a variable (or multiple variables)
  2. mutate a variable (or multiple variables) within the groups
cereal |> 
  group_by(manuf) |> 
  mutate(mean_sugar = mean(sugars))
name manuf type calories protein fat sodium fiber carbo sugars potass vitamins shelf weight cups rating mean_sugar
100% Bran N cold 70 4 1 130 10.0 5.0 6 280 25 3 1.00 0.33 68.40297 1.833333
100% Natural Bran Q cold 120 3 5 15 2.0 8.0 8 135 0 3 1.00 1.00 33.98368 5.250000
All-Bran K cold 70 4 1 260 9.0 7.0 5 320 25 3 1.00 0.33 59.42551 7.565217
All-Bran with Extra Fiber K cold 50 4 0 140 14.0 8.0 0 330 25 3 1.00 0.50 93.70491 7.565217
Almond Delight R cold 110 2 2 200 1.0 14.0 8 -1 25 3 1.00 0.75 34.38484 6.125000
Apple Cinnamon Cheerios G cold 110 2 2 180 1.5 10.5 10 70 25 1 1.00 0.75 29.50954 7.954546
Apple Jacks K cold 110 2 0 125 1.0 11.0 14 30 25 2 1.00 1.00 33.17409 7.565217
Basic 4 G cold 130 3 2 210 2.0 18.0 8 100 25 3 1.33 0.75 37.03856 7.954546
Bran Chex R cold 90 2 1 200 4.0 15.0 6 125 25 1 1.00 0.67 49.12025 6.125000
Bran Flakes P cold 90 3 0 210 5.0 13.0 5 190 25 3 1.00 0.67 53.31381 8.777778
Cap'n'Crunch Q cold 120 1 2 220 0.0 12.0 12 35 25 2 1.00 0.75 18.04285 5.250000
Cheerios G cold 110 6 2 290 2.0 17.0 1 105 25 1 1.00 1.25 50.76500 7.954546
Cinnamon Toast Crunch G cold 120 1 3 210 0.0 13.0 9 45 25 2 1.00 0.75 19.82357 7.954546
Clusters G cold 110 3 2 140 2.0 13.0 7 105 25 3 1.00 0.50 40.40021 7.954546
Cocoa Puffs G cold 110 1 1 180 0.0 12.0 13 55 25 2 1.00 1.00 22.73645 7.954546
Corn Chex R cold 110 2 0 280 0.0 22.0 3 25 25 1 1.00 1.00 41.44502 6.125000
Corn Flakes K cold 100 2 0 290 1.0 21.0 2 35 25 1 1.00 1.00 45.86332 7.565217
Corn Pops K cold 110 1 0 90 1.0 13.0 12 20 25 2 1.00 1.00 35.78279 7.565217
Count Chocula G cold 110 1 1 180 0.0 12.0 13 65 25 2 1.00 1.00 22.39651 7.954546
Cracklin' Oat Bran K cold 110 3 3 140 4.0 10.0 7 160 25 3 1.00 0.50 40.44877 7.565217
Cream of Wheat (Quick) N hot 100 3 0 80 1.0 21.0 0 -1 0 2 1.00 1.00 64.53382 1.833333
Crispix K cold 110 2 0 220 1.0 21.0 3 30 25 3 1.00 1.00 46.89564 7.565217
Crispy Wheat & Raisins G cold 100 2 1 140 2.0 11.0 10 120 25 3 1.00 0.75 36.17620 7.954546
Double Chex R cold 100 2 0 190 1.0 18.0 5 80 25 3 1.00 0.75 44.33086 6.125000
Froot Loops K cold 110 2 1 125 1.0 11.0 13 30 25 2 1.00 1.00 32.20758 7.565217
Frosted Flakes K cold 110 1 0 200 1.0 14.0 11 25 25 1 1.00 0.75 31.43597 7.565217
Frosted Mini-Wheats K cold 100 3 0 0 3.0 14.0 7 100 25 2 1.00 0.80 58.34514 7.565217
Fruit & Fibre Dates; Walnuts; and Oats P cold 120 3 2 160 5.0 12.0 10 200 25 3 1.25 0.67 40.91705 8.777778
Fruitful Bran K cold 120 3 0 240 5.0 14.0 12 190 25 3 1.33 0.67 41.01549 7.565217
Fruity Pebbles P cold 110 1 1 135 0.0 13.0 12 25 25 2 1.00 0.75 28.02576 8.777778
Golden Crisp P cold 100 2 0 45 0.0 11.0 15 40 25 1 1.00 0.88 35.25244 8.777778
Golden Grahams G cold 110 1 1 280 0.0 15.0 9 45 25 2 1.00 0.75 23.80404 7.954546
Grape Nuts Flakes P cold 100 3 1 140 3.0 15.0 5 85 25 3 1.00 0.88 52.07690 8.777778
Grape-Nuts P cold 110 3 0 170 3.0 17.0 3 90 25 3 1.00 0.25 53.37101 8.777778
Great Grains Pecan P cold 120 3 3 75 3.0 13.0 4 100 25 3 1.00 0.33 45.81172 8.777778
Honey Graham Ohs Q cold 120 1 2 220 1.0 12.0 11 45 25 2 1.00 1.00 21.87129 5.250000
Honey Nut Cheerios G cold 110 3 1 250 1.5 11.5 10 90 25 1 1.00 0.75 31.07222 7.954546
Honey-comb P cold 110 1 0 180 0.0 14.0 11 35 25 1 1.00 1.33 28.74241 8.777778
Just Right Crunchy Nuggets K cold 110 2 1 170 1.0 17.0 6 60 100 3 1.00 1.00 36.52368 7.565217
Just Right Fruit & Nut K cold 140 3 1 170 2.0 20.0 9 95 100 3 1.30 0.75 36.47151 7.565217
Kix G cold 110 2 1 260 0.0 21.0 3 40 25 2 1.00 1.50 39.24111 7.954546
Life Q cold 100 4 2 150 2.0 12.0 6 95 25 2 1.00 0.67 45.32807 5.250000
Lucky Charms G cold 110 2 1 180 0.0 12.0 12 55 25 2 1.00 1.00 26.73451 7.954546
Maypo A hot 100 4 1 0 0.0 16.0 3 95 25 2 1.00 1.00 54.85092 3.000000
Muesli Raisins; Dates; & Almonds R cold 150 4 3 95 3.0 16.0 11 170 25 3 1.00 1.00 37.13686 6.125000
Muesli Raisins; Peaches; & Pecans R cold 150 4 3 150 3.0 16.0 11 170 25 3 1.00 1.00 34.13976 6.125000
Mueslix Crispy Blend K cold 160 3 2 150 3.0 17.0 13 160 25 3 1.50 0.67 30.31335 7.565217
Multi-Grain Cheerios G cold 100 2 1 220 2.0 15.0 6 90 25 1 1.00 1.00 40.10596 7.954546
Nut&Honey Crunch K cold 120 2 1 190 0.0 15.0 9 40 25 2 1.00 0.67 29.92429 7.565217
Nutri-Grain Almond-Raisin K cold 140 3 2 220 3.0 21.0 7 130 25 3 1.33 0.67 40.69232 7.565217
Nutri-grain Wheat K cold 90 3 0 170 3.0 18.0 2 90 25 3 1.00 1.00 59.64284 7.565217
Oatmeal Raisin Crisp G cold 130 3 2 170 1.5 13.5 10 120 25 3 1.25 0.50 30.45084 7.954546
Post Nat. Raisin Bran P cold 120 3 1 200 6.0 11.0 14 260 25 3 1.33 0.67 37.84059 8.777778
Product 19 K cold 100 3 0 320 1.0 20.0 3 45 100 3 1.00 1.00 41.50354 7.565217
Puffed Rice Q cold 50 1 0 0 0.0 13.0 0 15 0 3 0.50 1.00 60.75611 5.250000
Puffed Wheat Q cold 50 2 0 0 1.0 10.0 0 50 0 3 0.50 1.00 63.00565 5.250000
Quaker Oat Squares Q cold 100 4 1 135 2.0 14.0 6 110 25 3 1.00 0.50 49.51187 5.250000
Quaker Oatmeal Q hot 100 5 2 0 2.7 -1.0 -1 110 0 1 1.00 0.67 50.82839 5.250000
Raisin Bran K cold 120 3 1 210 5.0 14.0 12 240 25 2 1.33 0.75 39.25920 7.565217
Raisin Nut Bran G cold 100 3 2 140 2.5 10.5 8 140 25 3 1.00 0.50 39.70340 7.954546
Raisin Squares K cold 90 2 0 0 2.0 15.0 6 110 25 3 1.00 0.50 55.33314 7.565217
Rice Chex R cold 110 1 0 240 0.0 23.0 2 30 25 1 1.00 1.13 41.99893 6.125000
Rice Krispies K cold 110 2 0 290 0.0 22.0 3 35 25 1 1.00 1.00 40.56016 7.565217
Shredded Wheat N cold 80 2 0 0 3.0 16.0 0 95 0 1 0.83 1.00 68.23588 1.833333
Shredded Wheat 'n'Bran N cold 90 3 0 0 4.0 19.0 0 140 0 1 1.00 0.67 74.47295 1.833333
Shredded Wheat spoon size N cold 90 3 0 0 3.0 20.0 0 120 0 1 1.00 0.67 72.80179 1.833333
Smacks K cold 110 2 1 70 1.0 9.0 15 40 25 2 1.00 0.75 31.23005 7.565217
Special K K cold 110 6 0 230 1.0 16.0 3 55 25 1 1.00 1.00 53.13132 7.565217
Strawberry Fruit Wheats N cold 90 2 0 15 3.0 15.0 5 90 25 2 1.00 1.00 59.36399 1.833333
Total Corn Flakes G cold 110 2 1 200 0.0 21.0 3 35 100 3 1.00 1.00 38.83975 7.954546
Total Raisin Bran G cold 140 3 1 190 4.0 15.0 14 230 100 3 1.50 1.00 28.59278 7.954546
Total Whole Grain G cold 100 3 1 200 3.0 16.0 3 110 100 3 1.00 1.00 46.65884 7.954546
Triples G cold 110 2 1 250 0.0 21.0 3 60 25 3 1.00 0.75 39.10617 7.954546
Trix G cold 110 1 1 140 0.0 13.0 12 25 25 2 1.00 1.00 27.75330 7.954546
Wheat Chex R cold 100 3 1 230 3.0 17.0 3 115 25 1 1.00 0.67 49.78744 6.125000
Wheaties G cold 100 3 1 200 3.0 17.0 3 110 25 1 1.00 1.00 51.59219 7.954546
Wheaties Honey Gold G cold 110 2 1 200 1.0 16.0 8 60 25 1 1.00 0.75 36.18756 7.954546

The Difference?

group_by() + summarize() collapses the data.

  • You will only have one row per group remaining.


group_by() + mutate() does not.

  • You will have the full number of rows remaining.
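One way to see the difference is to compare row counts. A minimal sketch on a made-up data frame (`toy` and its values are hypothetical):

```r
library(dplyr)

# Hypothetical toy data: 5 rows, 3 manufacturers
toy <- data.frame(
  manuf  = c("G", "G", "K", "K", "N"),
  sugars = c(10, 8, 5, 14, 6)
)

# summarise() collapses to one row per group
collapsed <- toy |>
  group_by(manuf) |>
  summarise(mean_sugar = mean(sugars))

# mutate() keeps every row and repeats the group mean
kept <- toy |>
  group_by(manuf) |>
  mutate(mean_sugar = mean(sugars)) |>
  ungroup()

nrow(collapsed)  # 3, one row per manuf
nrow(kept)       # 5, same as toy
```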

How do we “group” and “summarize” in base R?

You can use the aggregate() function.

cereal |> 
  aggregate(sugars ~ manuf, FUN = mean)
manuf sugars
A 3.000000
G 7.954546
K 7.565217
N 1.833333
P 8.777778
Q 5.250000
R 6.125000
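For comparison, a sketch of two other base R spellings on a made-up data frame (`toy` is hypothetical): aggregate() with an explicit data argument, and tapply(), which returns a named vector rather than a data frame.

```r
# Hypothetical toy data: 5 rows, 3 manufacturers
toy <- data.frame(
  manuf  = c("G", "G", "K", "K", "N"),
  sugars = c(10, 8, 5, 14, 6)
)

# Formula interface with an explicit data argument
by_formula <- aggregate(sugars ~ manuf, data = toy, FUN = mean)

# tapply() returns the group means as a named vector
by_tapply <- tapply(toy$sugars, toy$manuf, mean)

by_formula
by_tapply
```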

Glue it all together!

cereal |> 
  filter(type == "cold") |> 
  mutate(cal_per_cup = calories / cups) |> 
  group_by(manuf) |> 
  summarise(mean_cal_per_cup = mean(cal_per_cup))
manuf mean_cal_per_cup
G 137.7879
K 145.3518
N 130.1556
P 194.7578
Q 121.3220
R 133.8659

Save your changes!

When you manipulate your data, make sure you assign your new dataset to a variable.

cereal_summary <- cereal |> 
  filter(type == "cold") |> 
  mutate(cal_per_cup = calories / cups) |> 
  group_by(manuf) |> 
  summarise(mean_cal_per_cup = mean(cal_per_cup))

Code Formatting

Similar to the + formatting in ggplot, do not continue a line after writing a |>!

cereal |> group_by(type) |> summarise(mean_calories = mean(calories), num_cereals = n(), mean_sugar = mean(sugars))
cereal |> 
  group_by(type) |> 
  summarise(mean_calories = mean(calories), 
            num_cereals = n(),
            mean_sugar = mean(sugars))

PA 3: Identify the Mystery College

Today you will use the dplyr package to clean some data and then use that cleaned data to figure out what college Ephelia has been accepted to.

Submit the full name of the college Ephelia will attend to the Canvas Quiz.

To do…

  • PA 3: Identify the Mystery College
    • Due Wednesday 4/19 at 10:00am

Wednesday, April 19

Today we will…

  • Review Lab 2
  • Using External Resources
  • Extend dplyr verbs:
    • across()
    • if_else()
    • case_when()
  • Thinking about Data Ethics
  • Lab 3: Familiarity with AAE
  • Challenge 3: Demographic Comparisons & Data Ethics

Lab 2

Using Outside Resources

Citing Your Sources

When you write code, you will need to reference function/package documentation and external resources.

  • This is part of being a programmer!

When you rely on external resources for an assignment in this course, you must cite your sources.

  • If you use any resources outside of the course text, the course slides, and the posted cheatsheets, you must include a citation!
  • You lose points if you do not.

ChatGPT

Extending dplyr verbs

Example Data set – Cereal

library(liver)
data(cereal)
head(cereal)
                       name manuf type calories protein fat sodium fiber carbo
1                 100% Bran     N cold       70       4   1    130  10.0   5.0
2         100% Natural Bran     Q cold      120       3   5     15   2.0   8.0
3                  All-Bran     K cold       70       4   1    260   9.0   7.0
4 All-Bran with Extra Fiber     K cold       50       4   0    140  14.0   8.0
5            Almond Delight     R cold      110       2   2    200   1.0  14.0
6   Apple Cinnamon Cheerios     G cold      110       2   2    180   1.5  10.5
  sugars potass vitamins shelf weight cups   rating
1      6    280       25     3      1 0.33 68.40297
2      8    135        0     3      1 1.00 33.98368
3      5    320       25     3      1 0.33 59.42551
4      0    330       25     3      1 0.50 93.70491
5      8     -1       25     3      1 0.75 34.38484
6     10     70       25     1      1 0.75 29.50954

Count with count()

How many cereals does each manuf have in this dataset?

cereal |> 
  group_by(manuf) |> 
  count()
# A tibble: 7 × 2
# Groups:   manuf [7]
  manuf     n
  <fct> <int>
1 A         1
2 G        22
3 K        23
4 N         6
5 P         9
6 Q         8
7 R         8
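count() also accepts the grouping variable directly, which gives the same counts as group_by() followed by summarise(n = n()). A minimal sketch on made-up data (`toy` is hypothetical):

```r
library(dplyr)

# Hypothetical toy data: one row per cereal, manuf only
toy <- data.frame(manuf = c("G", "K", "G", "N", "K", "G"))

# Shortcut: count() does the grouping itself
via_count <- toy |>
  count(manuf)

# Long form: group, then count the rows in each group
via_summarise <- toy |>
  group_by(manuf) |>
  summarise(n = n())

via_count
```

The shortcut returns an ungrouped result, so no "Groups:" line is carried along.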

Summarize multiple columns with across()

For each type of cereal, calculate the mean nutrient levels.

  • .cols – specify the columns to apply functions to
  • .fns – specify the functions to apply
cereal |> 
  group_by(type) |> 
  summarise(across(.cols = calories:potass, .fns = mean))
# A tibble: 2 × 9
  type  calories protein   fat sodium fiber carbo sugars potass
  <fct>    <dbl>   <dbl> <dbl>  <dbl> <dbl> <dbl>  <dbl>  <dbl>
1 cold      107.    2.49  1.01  165.   2.19  14.7  7.18    97.2
2 hot       100     4     1      26.7  1.23  12    0.667   68  
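across() can also pick columns by type and apply several functions at once. A sketch on made-up data (`toy` and its values are hypothetical); with a named list in .fns, the output columns are named column_function by default.

```r
library(dplyr)

# Hypothetical toy data with two numeric nutrient columns
toy <- data.frame(
  type     = c("cold", "cold", "hot", "hot"),
  calories = c(70, 110, 100, 100),
  protein  = c(4, 2, 3, 5)
)

# where(is.numeric) selects every numeric column;
# the named list applies both mean() and max() to each of them
nutrient_summary <- toy |>
  group_by(type) |>
  summarise(across(.cols = where(is.numeric),
                   .fns = list(mean = mean, max = max)))

names(nutrient_summary)
```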

Discretize with if_else()

For each cereal, label the calories as “high” or “low”.

One if-else statement:

  • if_else(<CONDITION>, <TRUE OUTPUT>, <FALSE OUTPUT>)

cereal |> 
  mutate(cal_category = if_else(calories <= 100, "low", "high"),
         .after = calories)
name manuf type calories cal_category protein fat sodium fiber carbo sugars potass vitamins shelf weight cups rating
100% Bran N cold 70 low 4 1 130 10.0 5.0 6 280 25 3 1.00 0.33 68.40297
100% Natural Bran Q cold 120 high 3 5 15 2.0 8.0 8 135 0 3 1.00 1.00 33.98368
All-Bran K cold 70 low 4 1 260 9.0 7.0 5 320 25 3 1.00 0.33 59.42551
All-Bran with Extra Fiber K cold 50 low 4 0 140 14.0 8.0 0 330 25 3 1.00 0.50 93.70491
Almond Delight R cold 110 high 2 2 200 1.0 14.0 8 -1 25 3 1.00 0.75 34.38484
Apple Cinnamon Cheerios G cold 110 high 2 2 180 1.5 10.5 10 70 25 1 1.00 0.75 29.50954
Apple Jacks K cold 110 high 2 0 125 1.0 11.0 14 30 25 2 1.00 1.00 33.17409
Basic 4 G cold 130 high 3 2 210 2.0 18.0 8 100 25 3 1.33 0.75 37.03856
Bran Chex R cold 90 low 2 1 200 4.0 15.0 6 125 25 1 1.00 0.67 49.12025
Bran Flakes P cold 90 low 3 0 210 5.0 13.0 5 190 25 3 1.00 0.67 53.31381
Cap'n'Crunch Q cold 120 high 1 2 220 0.0 12.0 12 35 25 2 1.00 0.75 18.04285
Cheerios G cold 110 high 6 2 290 2.0 17.0 1 105 25 1 1.00 1.25 50.76500
Cinnamon Toast Crunch G cold 120 high 1 3 210 0.0 13.0 9 45 25 2 1.00 0.75 19.82357
Clusters G cold 110 high 3 2 140 2.0 13.0 7 105 25 3 1.00 0.50 40.40021
Cocoa Puffs G cold 110 high 1 1 180 0.0 12.0 13 55 25 2 1.00 1.00 22.73645
Corn Chex R cold 110 high 2 0 280 0.0 22.0 3 25 25 1 1.00 1.00 41.44502
Corn Flakes K cold 100 low 2 0 290 1.0 21.0 2 35 25 1 1.00 1.00 45.86332
Corn Pops K cold 110 high 1 0 90 1.0 13.0 12 20 25 2 1.00 1.00 35.78279
Count Chocula G cold 110 high 1 1 180 0.0 12.0 13 65 25 2 1.00 1.00 22.39651
Cracklin' Oat Bran K cold 110 high 3 3 140 4.0 10.0 7 160 25 3 1.00 0.50 40.44877
Cream of Wheat (Quick) N hot 100 low 3 0 80 1.0 21.0 0 -1 0 2 1.00 1.00 64.53382
Crispix K cold 110 high 2 0 220 1.0 21.0 3 30 25 3 1.00 1.00 46.89564
Crispy Wheat & Raisins G cold 100 low 2 1 140 2.0 11.0 10 120 25 3 1.00 0.75 36.17620
Double Chex R cold 100 low 2 0 190 1.0 18.0 5 80 25 3 1.00 0.75 44.33086
Froot Loops K cold 110 high 2 1 125 1.0 11.0 13 30 25 2 1.00 1.00 32.20758
Frosted Flakes K cold 110 high 1 0 200 1.0 14.0 11 25 25 1 1.00 0.75 31.43597
Frosted Mini-Wheats K cold 100 low 3 0 0 3.0 14.0 7 100 25 2 1.00 0.80 58.34514
Fruit & Fibre Dates; Walnuts; and Oats P cold 120 high 3 2 160 5.0 12.0 10 200 25 3 1.25 0.67 40.91705
Fruitful Bran K cold 120 high 3 0 240 5.0 14.0 12 190 25 3 1.33 0.67 41.01549
Fruity Pebbles P cold 110 high 1 1 135 0.0 13.0 12 25 25 2 1.00 0.75 28.02576
Golden Crisp P cold 100 low 2 0 45 0.0 11.0 15 40 25 1 1.00 0.88 35.25244
Golden Grahams G cold 110 high 1 1 280 0.0 15.0 9 45 25 2 1.00 0.75 23.80404
Grape Nuts Flakes P cold 100 low 3 1 140 3.0 15.0 5 85 25 3 1.00 0.88 52.07690
Grape-Nuts P cold 110 high 3 0 170 3.0 17.0 3 90 25 3 1.00 0.25 53.37101
Great Grains Pecan P cold 120 high 3 3 75 3.0 13.0 4 100 25 3 1.00 0.33 45.81172
Honey Graham Ohs Q cold 120 high 1 2 220 1.0 12.0 11 45 25 2 1.00 1.00 21.87129
Honey Nut Cheerios G cold 110 high 3 1 250 1.5 11.5 10 90 25 1 1.00 0.75 31.07222
Honey-comb P cold 110 high 1 0 180 0.0 14.0 11 35 25 1 1.00 1.33 28.74241
Just Right Crunchy Nuggets K cold 110 high 2 1 170 1.0 17.0 6 60 100 3 1.00 1.00 36.52368
Just Right Fruit & Nut K cold 140 high 3 1 170 2.0 20.0 9 95 100 3 1.30 0.75 36.47151
Kix G cold 110 high 2 1 260 0.0 21.0 3 40 25 2 1.00 1.50 39.24111
Life Q cold 100 low 4 2 150 2.0 12.0 6 95 25 2 1.00 0.67 45.32807
Lucky Charms G cold 110 high 2 1 180 0.0 12.0 12 55 25 2 1.00 1.00 26.73451
Maypo A hot 100 low 4 1 0 0.0 16.0 3 95 25 2 1.00 1.00 54.85092
Muesli Raisins; Dates; & Almonds R cold 150 high 4 3 95 3.0 16.0 11 170 25 3 1.00 1.00 37.13686
Muesli Raisins; Peaches; & Pecans R cold 150 high 4 3 150 3.0 16.0 11 170 25 3 1.00 1.00 34.13976
Mueslix Crispy Blend K cold 160 high 3 2 150 3.0 17.0 13 160 25 3 1.50 0.67 30.31335
Multi-Grain Cheerios G cold 100 low 2 1 220 2.0 15.0 6 90 25 1 1.00 1.00 40.10596
Nut&Honey Crunch K cold 120 high 2 1 190 0.0 15.0 9 40 25 2 1.00 0.67 29.92429
Nutri-Grain Almond-Raisin K cold 140 high 3 2 220 3.0 21.0 7 130 25 3 1.33 0.67 40.69232
Nutri-grain Wheat K cold 90 low 3 0 170 3.0 18.0 2 90 25 3 1.00 1.00 59.64284
Oatmeal Raisin Crisp G cold 130 high 3 2 170 1.5 13.5 10 120 25 3 1.25 0.50 30.45084
Post Nat. Raisin Bran P cold 120 high 3 1 200 6.0 11.0 14 260 25 3 1.33 0.67 37.84059
Product 19 K cold 100 low 3 0 320 1.0 20.0 3 45 100 3 1.00 1.00 41.50354
Puffed Rice Q cold 50 low 1 0 0 0.0 13.0 0 15 0 3 0.50 1.00 60.75611
Puffed Wheat Q cold 50 low 2 0 0 1.0 10.0 0 50 0 3 0.50 1.00 63.00565
Quaker Oat Squares Q cold 100 low 4 1 135 2.0 14.0 6 110 25 3 1.00 0.50 49.51187
Quaker Oatmeal Q hot 100 low 5 2 0 2.7 -1.0 -1 110 0 1 1.00 0.67 50.82839
Raisin Bran K cold 120 high 3 1 210 5.0 14.0 12 240 25 2 1.33 0.75 39.25920
Raisin Nut Bran G cold 100 low 3 2 140 2.5 10.5 8 140 25 3 1.00 0.50 39.70340
Raisin Squares K cold 90 low 2 0 0 2.0 15.0 6 110 25 3 1.00 0.50 55.33314
Rice Chex R cold 110 high 1 0 240 0.0 23.0 2 30 25 1 1.00 1.13 41.99893
Rice Krispies K cold 110 high 2 0 290 0.0 22.0 3 35 25 1 1.00 1.00 40.56016
Shredded Wheat N cold 80 low 2 0 0 3.0 16.0 0 95 0 1 0.83 1.00 68.23588
Shredded Wheat 'n'Bran N cold 90 low 3 0 0 4.0 19.0 0 140 0 1 1.00 0.67 74.47295
Shredded Wheat spoon size N cold 90 low 3 0 0 3.0 20.0 0 120 0 1 1.00 0.67 72.80179
Smacks K cold 110 high 2 1 70 1.0 9.0 15 40 25 2 1.00 0.75 31.23005
Special K K cold 110 high 6 0 230 1.0 16.0 3 55 25 1 1.00 1.00 53.13132
Strawberry Fruit Wheats N cold 90 low 2 0 15 3.0 15.0 5 90 25 2 1.00 1.00 59.36399
Total Corn Flakes G cold 110 high 2 1 200 0.0 21.0 3 35 100 3 1.00 1.00 38.83975
Total Raisin Bran G cold 140 high 3 1 190 4.0 15.0 14 230 100 3 1.50 1.00 28.59278
Total Whole Grain G cold 100 low 3 1 200 3.0 16.0 3 110 100 3 1.00 1.00 46.65884
Triples G cold 110 high 2 1 250 0.0 21.0 3 60 25 3 1.00 0.75 39.10617
Trix G cold 110 high 1 1 140 0.0 13.0 12 25 25 2 1.00 1.00 27.75330
Wheat Chex R cold 100 low 3 1 230 3.0 17.0 3 115 25 1 1.00 0.67 49.78744
Wheaties G cold 100 low 3 1 200 3.0 17.0 3 110 25 1 1.00 1.00 51.59219
Wheaties Honey Gold G cold 110 high 2 1 200 1.0 16.0 8 60 25 1 1.00 0.75 36.18756

.after – specifies the location of the newly created column
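if_else() is also handy for cleaning. The sketch below assumes that the -1 values in columns such as potass are placeholders for missing data (the slides do not say so) and replaces them with NA. The three rows are copied from the cereal data; `toy` and `cleaned` are illustrative names.

```r
library(dplyr)

# Three rows taken from the cereal data (name and potass only)
toy <- data.frame(
  name   = c("Almond Delight", "Bran Chex", "Cream of Wheat (Quick)"),
  potass = c(-1L, 125L, -1L)
)

# Assumption: -1 marks a missing value, so recode it to NA.
# NA_integer_ keeps both outputs the same type, which if_else() checks.
cleaned <- toy |>
  mutate(potass = if_else(potass == -1L, NA_integer_, potass))

cleaned
```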

Re-level with case_when()

For each manufacturer, change the manuf code to the name.

A series of if-else statements.

cereal |> 
  mutate(manuf = case_when(manuf == "A" ~ "American Home Food Products", 
                           manuf == "G" ~ "General Mills", 
                           manuf == "K" ~ "Kelloggs", 
                           manuf == "N" ~ "Nabisco", 
                           manuf == "P" ~ "Post", 
                           manuf == "Q" ~ "Quaker Oats", 
                           manuf == "R" ~ "Ralston Purina"))
name manuf type calories protein fat sodium fiber carbo sugars potass vitamins shelf weight cups rating
100% Bran Nabisco cold 70 4 1 130 10.0 5.0 6 280 25 3 1.00 0.33 68.40297
100% Natural Bran Quaker Oats cold 120 3 5 15 2.0 8.0 8 135 0 3 1.00 1.00 33.98368
All-Bran Kelloggs cold 70 4 1 260 9.0 7.0 5 320 25 3 1.00 0.33 59.42551
All-Bran with Extra Fiber Kelloggs cold 50 4 0 140 14.0 8.0 0 330 25 3 1.00 0.50 93.70491
Almond Delight Ralston Purina cold 110 2 2 200 1.0 14.0 8 -1 25 3 1.00 0.75 34.38484
Apple Cinnamon Cheerios General Mills cold 110 2 2 180 1.5 10.5 10 70 25 1 1.00 0.75 29.50954
Apple Jacks Kelloggs cold 110 2 0 125 1.0 11.0 14 30 25 2 1.00 1.00 33.17409
Basic 4 General Mills cold 130 3 2 210 2.0 18.0 8 100 25 3 1.33 0.75 37.03856
Bran Chex Ralston Purina cold 90 2 1 200 4.0 15.0 6 125 25 1 1.00 0.67 49.12025
Bran Flakes Post cold 90 3 0 210 5.0 13.0 5 190 25 3 1.00 0.67 53.31381
Cap'n'Crunch Quaker Oats cold 120 1 2 220 0.0 12.0 12 35 25 2 1.00 0.75 18.04285
Cheerios General Mills cold 110 6 2 290 2.0 17.0 1 105 25 1 1.00 1.25 50.76500
Cinnamon Toast Crunch General Mills cold 120 1 3 210 0.0 13.0 9 45 25 2 1.00 0.75 19.82357
Clusters General Mills cold 110 3 2 140 2.0 13.0 7 105 25 3 1.00 0.50 40.40021
Cocoa Puffs General Mills cold 110 1 1 180 0.0 12.0 13 55 25 2 1.00 1.00 22.73645
Corn Chex Ralston Purina cold 110 2 0 280 0.0 22.0 3 25 25 1 1.00 1.00 41.44502
Corn Flakes Kelloggs cold 100 2 0 290 1.0 21.0 2 35 25 1 1.00 1.00 45.86332
Corn Pops Kelloggs cold 110 1 0 90 1.0 13.0 12 20 25 2 1.00 1.00 35.78279
Count Chocula General Mills cold 110 1 1 180 0.0 12.0 13 65 25 2 1.00 1.00 22.39651
Cracklin' Oat Bran Kelloggs cold 110 3 3 140 4.0 10.0 7 160 25 3 1.00 0.50 40.44877
Cream of Wheat (Quick) Nabisco hot 100 3 0 80 1.0 21.0 0 -1 0 2 1.00 1.00 64.53382
Crispix Kelloggs cold 110 2 0 220 1.0 21.0 3 30 25 3 1.00 1.00 46.89564
Crispy Wheat & Raisins General Mills cold 100 2 1 140 2.0 11.0 10 120 25 3 1.00 0.75 36.17620
Double Chex Ralston Purina cold 100 2 0 190 1.0 18.0 5 80 25 3 1.00 0.75 44.33086
Froot Loops Kelloggs cold 110 2 1 125 1.0 11.0 13 30 25 2 1.00 1.00 32.20758
Frosted Flakes Kelloggs cold 110 1 0 200 1.0 14.0 11 25 25 1 1.00 0.75 31.43597
Frosted Mini-Wheats Kelloggs cold 100 3 0 0 3.0 14.0 7 100 25 2 1.00 0.80 58.34514
Fruit & Fibre Dates; Walnuts; and Oats Post cold 120 3 2 160 5.0 12.0 10 200 25 3 1.25 0.67 40.91705
Fruitful Bran Kelloggs cold 120 3 0 240 5.0 14.0 12 190 25 3 1.33 0.67 41.01549
Fruity Pebbles Post cold 110 1 1 135 0.0 13.0 12 25 25 2 1.00 0.75 28.02576
Golden Crisp Post cold 100 2 0 45 0.0 11.0 15 40 25 1 1.00 0.88 35.25244
Golden Grahams General Mills cold 110 1 1 280 0.0 15.0 9 45 25 2 1.00 0.75 23.80404
Grape Nuts Flakes Post cold 100 3 1 140 3.0 15.0 5 85 25 3 1.00 0.88 52.07690
Grape-Nuts Post cold 110 3 0 170 3.0 17.0 3 90 25 3 1.00 0.25 53.37101
Great Grains Pecan Post cold 120 3 3 75 3.0 13.0 4 100 25 3 1.00 0.33 45.81172
Honey Graham Ohs Quaker Oats cold 120 1 2 220 1.0 12.0 11 45 25 2 1.00 1.00 21.87129
Honey Nut Cheerios General Mills cold 110 3 1 250 1.5 11.5 10 90 25 1 1.00 0.75 31.07222
Honey-comb Post cold 110 1 0 180 0.0 14.0 11 35 25 1 1.00 1.33 28.74241
Just Right Crunchy Nuggets Kelloggs cold 110 2 1 170 1.0 17.0 6 60 100 3 1.00 1.00 36.52368
Just Right Fruit & Nut Kelloggs cold 140 3 1 170 2.0 20.0 9 95 100 3 1.30 0.75 36.47151
Kix General Mills cold 110 2 1 260 0.0 21.0 3 40 25 2 1.00 1.50 39.24111
Life Quaker Oats cold 100 4 2 150 2.0 12.0 6 95 25 2 1.00 0.67 45.32807
Lucky Charms General Mills cold 110 2 1 180 0.0 12.0 12 55 25 2 1.00 1.00 26.73451
Maypo American Home Food Products hot 100 4 1 0 0.0 16.0 3 95 25 2 1.00 1.00 54.85092
Muesli Raisins; Dates; & Almonds Ralston Purina cold 150 4 3 95 3.0 16.0 11 170 25 3 1.00 1.00 37.13686
Muesli Raisins; Peaches; & Pecans Ralston Purina cold 150 4 3 150 3.0 16.0 11 170 25 3 1.00 1.00 34.13976
Mueslix Crispy Blend Kelloggs cold 160 3 2 150 3.0 17.0 13 160 25 3 1.50 0.67 30.31335
Multi-Grain Cheerios General Mills cold 100 2 1 220 2.0 15.0 6 90 25 1 1.00 1.00 40.10596
Nut&Honey Crunch Kelloggs cold 120 2 1 190 0.0 15.0 9 40 25 2 1.00 0.67 29.92429
Nutri-Grain Almond-Raisin Kelloggs cold 140 3 2 220 3.0 21.0 7 130 25 3 1.33 0.67 40.69232
Nutri-grain Wheat Kelloggs cold 90 3 0 170 3.0 18.0 2 90 25 3 1.00 1.00 59.64284
Oatmeal Raisin Crisp General Mills cold 130 3 2 170 1.5 13.5 10 120 25 3 1.25 0.50 30.45084
Post Nat. Raisin Bran Post cold 120 3 1 200 6.0 11.0 14 260 25 3 1.33 0.67 37.84059
Product 19 Kelloggs cold 100 3 0 320 1.0 20.0 3 45 100 3 1.00 1.00 41.50354
Puffed Rice Quaker Oats cold 50 1 0 0 0.0 13.0 0 15 0 3 0.50 1.00 60.75611
Puffed Wheat Quaker Oats cold 50 2 0 0 1.0 10.0 0 50 0 3 0.50 1.00 63.00565
Quaker Oat Squares Quaker Oats cold 100 4 1 135 2.0 14.0 6 110 25 3 1.00 0.50 49.51187
Quaker Oatmeal Quaker Oats hot 100 5 2 0 2.7 -1.0 -1 110 0 1 1.00 0.67 50.82839
Raisin Bran Kelloggs cold 120 3 1 210 5.0 14.0 12 240 25 2 1.33 0.75 39.25920
Raisin Nut Bran General Mills cold 100 3 2 140 2.5 10.5 8 140 25 3 1.00 0.50 39.70340
Raisin Squares Kelloggs cold 90 2 0 0 2.0 15.0 6 110 25 3 1.00 0.50 55.33314
Rice Chex Ralston Purina cold 110 1 0 240 0.0 23.0 2 30 25 1 1.00 1.13 41.99893
Rice Krispies Kelloggs cold 110 2 0 290 0.0 22.0 3 35 25 1 1.00 1.00 40.56016
Shredded Wheat Nabisco cold 80 2 0 0 3.0 16.0 0 95 0 1 0.83 1.00 68.23588
Shredded Wheat 'n'Bran Nabisco cold 90 3 0 0 4.0 19.0 0 140 0 1 1.00 0.67 74.47295
Shredded Wheat spoon size Nabisco cold 90 3 0 0 3.0 20.0 0 120 0 1 1.00 0.67 72.80179
Smacks Kelloggs cold 110 2 1 70 1.0 9.0 15 40 25 2 1.00 0.75 31.23005
Special K Kelloggs cold 110 6 0 230 1.0 16.0 3 55 25 1 1.00 1.00 53.13132
Strawberry Fruit Wheats Nabisco cold 90 2 0 15 3.0 15.0 5 90 25 2 1.00 1.00 59.36399
Total Corn Flakes General Mills cold 110 2 1 200 0.0 21.0 3 35 100 3 1.00 1.00 38.83975
Total Raisin Bran General Mills cold 140 3 1 190 4.0 15.0 14 230 100 3 1.50 1.00 28.59278
Total Whole Grain General Mills cold 100 3 1 200 3.0 16.0 3 110 100 3 1.00 1.00 46.65884
Triples General Mills cold 110 2 1 250 0.0 21.0 3 60 25 3 1.00 0.75 39.10617
Trix General Mills cold 110 1 1 140 0.0 13.0 12 25 25 2 1.00 1.00 27.75330
Wheat Chex Ralston Purina cold 100 3 1 230 3.0 17.0 3 115 25 1 1.00 0.67 49.78744
Wheaties General Mills cold 100 3 1 200 3.0 17.0 3 110 25 1 1.00 1.00 51.59219
Wheaties Honey Gold General Mills cold 110 2 1 200 1.0 16.0 8 60 25 1 1.00 0.75 36.18756

group_by() + slice()

For each manuf, find the cereal with the most fiber.

cereal |> 
  group_by(manuf) |> 
  slice_max(order_by = fiber)
name manuf type calories protein fat sodium fiber carbo sugars potass vitamins shelf weight cups rating
Maypo A hot 100 4 1 0 0.0 16 3 95 25 2 1.00 1.00 54.85092
Total Raisin Bran G cold 140 3 1 190 4.0 15 14 230 100 3 1.50 1.00 28.59278
All-Bran with Extra Fiber K cold 50 4 0 140 14.0 8 0 330 25 3 1.00 0.50 93.70491
100% Bran N cold 70 4 1 130 10.0 5 6 280 25 3 1.00 0.33 68.40297
Post Nat. Raisin Bran P cold 120 3 1 200 6.0 11 14 260 25 3 1.33 0.67 37.84059
Quaker Oatmeal Q hot 100 5 2 0 2.7 -1 -1 110 0 1 1.00 0.67 50.82839
Bran Chex R cold 90 2 1 200 4.0 15 6 125 25 1 1.00 0.67 49.12025
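
By default, `slice_max()` keeps ties, so a group can return more than one row when several cereals share the top value. A minimal sketch on a made-up toy data frame (not the cereal data) shows the difference that `with_ties = FALSE` makes:

```r
library(dplyr)

# Toy data, made up for illustration only
toy <- data.frame(
  manuf = c("A", "A", "B", "B", "B"),
  name  = c("a1", "a2", "b1", "b2", "b3"),
  fiber = c(1, 1, 3, 5, 5)
)

# Default (with_ties = TRUE): every row tied for the maximum is kept
with_ties <- toy |>
  group_by(manuf) |>
  slice_max(order_by = fiber)

# with_ties = FALSE: exactly one row per group
one_each <- toy |>
  group_by(manuf) |>
  slice_max(order_by = fiber, with_ties = FALSE)

nrow(with_ties)  # 4 rows: both groups have a two-way tie
nrow(one_each)   # 2 rows: one per manuf
```

Which option is appropriate depends on whether tied cereals should all be reported or only one representative per group.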

Multiple Variables in slice()

Find the 3 cereals with the lowest calories and sugars.

  • If you are ordering by multiple variables, wrap them in a data.frame! Rows are ordered by the first variable, and later variables only break ties.
cereal |> 
  slice_min(order_by = data.frame(calories, sugars),
            n = 3)
name manuf type calories protein fat sodium fiber carbo sugars potass vitamins shelf weight cups rating
All-Bran with Extra Fiber K cold 50 4 0 140 14 8 0 330 25 3 1.0 0.5 93.70491
Puffed Rice Q cold 50 1 0 0 0 13 0 15 0 3 0.5 1.0 60.75611
Puffed Wheat Q cold 50 2 0 0 1 10 0 50 0 3 0.5 1.0 63.00565
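
One way to sanity-check a multi-variable `slice_min()` is to compare it with `arrange()` followed by `slice_head()`. The sketch below uses a made-up toy data frame (the names `c1` to `c5` are hypothetical) and assumes a dplyr version that accepts a data.frame in `order_by`:

```r
library(dplyr)

# Toy data, made up for illustration only
toy <- data.frame(
  name     = c("c1", "c2", "c3", "c4", "c5"),
  calories = c(50, 50, 70, 50, 110),
  sugars   = c(3, 0, 5, 1, 8)
)

# Order by calories, then by sugars, and keep the 3 smallest rows
low3 <- toy |>
  slice_min(order_by = data.frame(calories, sugars), n = 3)

# The same rows via an explicit sort
alt <- toy |>
  arrange(calories, sugars) |>
  slice_head(n = 3)

setequal(low3$name, alt$name)  # TRUE: both pick c2, c4, and c1
```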

Piping into ggplot()

Plot the mean calories per cup for each manuf.

cereal |> 
  mutate(manuf = case_when(manuf == "A" ~ "American Home Food Products", 
                           manuf == "G" ~ "General Mills", 
                           manuf == "K" ~ "Kelloggs", 
                           manuf == "N" ~ "Nabisco", 
                           manuf == "P" ~ "Post", 
                           manuf == "Q" ~ "Quaker Oats", 
                           manuf == "R" ~ "Ralston Purina")) |> 
  filter(type == "cold") |> 
  mutate(cal_per_cup = calories / cups) |> 
  group_by(manuf) |> 
  summarise(mean_cal_per_cup = mean(cal_per_cup)) |> 
  ggplot(aes(x = manuf, 
             y = mean_cal_per_cup, 
             shape = manuf)) +
  geom_point(show.legend = FALSE,
             size = 6) +
  labs(x = "Manufacturer",
       subtitle = "Mean Calories per Cup") +
  theme_bw() +
  theme(axis.title.y = element_blank(),
        axis.title.x  = element_text(size = 30),
        plot.subtitle = element_text(size = 32),
        axis.text = element_text(size = 22),
        axis.text.x = element_text(angle = 13)) +
  scale_y_continuous(limits = c(75,225))
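
Before piping into `ggplot()`, it can help to stop the pipeline at `summarise()` and inspect the numbers that will be plotted. A minimal sketch with made-up toy rows (not the cereal data) and the same wrangling steps:

```r
library(dplyr)

# Toy rows, made up for illustration only
toy <- data.frame(
  manuf    = c("G", "G", "K", "K", "Q"),
  type     = c("cold", "cold", "cold", "cold", "hot"),
  calories = c(110, 100, 120, 90, 100),
  cups     = c(1, 0.5, 0.75, 1, 0.67)
)

# Same steps as the plotting pipeline, stopped before ggplot()
plot_data <- toy |>
  filter(type == "cold") |>                  # drops the hot cereal (Q)
  mutate(cal_per_cup = calories / cups) |>   # calories per cup for each cereal
  group_by(manuf) |>
  summarise(mean_cal_per_cup = mean(cal_per_cup))

plot_data  # one row per manuf: G = 155, K = 125
```

Checking the range of `mean_cal_per_cup` this way also helps when choosing axis limits, since points outside `limits` are dropped from the plot.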

Creating a Game Plan

Just like when creating graphics with ggplot, wrangling data with dplyr involves thinking through many steps and writing many layers of code.

  • To help us think through a wrangling problem, we are going to create a game plan before we start writing code.
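
As an illustration of what a game plan might look like, the sketch below writes the plan as comments first and then translates each step into one dplyr verb. The question and the toy data are made up for this example:

```r
library(dplyr)

# Toy data, made up for illustration only
toy <- data.frame(
  shelf  = c(1, 1, 2, 2, 3),
  type   = c("cold", "hot", "cold", "cold", "cold"),
  sugars = c(3, 1, 12, 10, 6)
)

# Game plan: "Which shelf has the highest mean sugars among cold cereals?"
# 1. Keep only cold cereals          -> filter()
# 2. Work shelf by shelf             -> group_by()
# 3. Average sugars on each shelf    -> summarise()
# 4. Keep the shelf with the maximum -> slice_max()
sweetest_shelf <- toy |>
  filter(type == "cold") |>
  group_by(shelf) |>
  summarise(mean_sugars = mean(sugars)) |>
  slice_max(order_by = mean_sugars)

sweetest_shelf  # shelf 2, with mean_sugars = 11
```

Writing the plan before the code makes it easier to notice a missing step, such as forgetting the filter, before any output is produced.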

Thinking about Data Ethics

Data Ethics

1. What do we mean by data ethics?

  • The process of evaluating data collection, processing, analysis, and dissemination practices for their adverse impacts on individuals, systems, and society.

2. Why do we (as statisticians, data scientists, folks working with data) need to think about data ethics?

  • We have a lot of power to declare truth and fact, hiding behind the black box of data science methods.

From Hippocratic Oath to Data Science Oath

  • I will not be ashamed to say, “I know not,” nor will I fail to call in my colleagues when the skills of another are needed.
  • I will respect the privacy of my data subjects, for their data are not disclosed to me that the world may know.
  • I will remember that my data are not just numbers without meaning or context, but represent real people and situations, and that my work may lead to unintended societal consequences, such as inequality, poverty, and disparities due to algorithmic bias.

ASA Ethical Guidelines

  • The American Statistical Association’s Ethical Guidelines for Statistical Practice are intended to help statistics practitioners make decisions ethically.
  • They aim to promote accountability by informing those who rely on statistics of the standards they should expect.

Visit the ASA's Ethical Guidelines for Statistical Practice and discuss one of the guidelines with your partner.

  • What surprises you? What did you learn?
  • In what scenario might this come into play?

Institutional Review Board

  • IRB reviews help to ensure that research participants are protected from research-related risks and treated ethically.
  • This is a necessary prerequisite for maintaining the public’s trust in research and allowing science to advance for the common good.

Note

Watch a video about IRB to learn more.

Lab 3: Familiarity with AAE + Challenge 3: Demographic Comparisons + Data Ethics

To do…

  • Lab 3: Familiarity with AAE
    • Due Friday, 4/21 at 11:59pm
  • Challenge 3: Demographic Comparisons & Data Ethics
    • Due Saturday, 4/22 at 11:59pm
  • Read Chapter 4: Data Joins and Transformations
    • Concept Check 4.1 + 4.2 due Monday 4/24 at 10:00am